Hilbert's Basis Theorem

7 min read

Ambitious goals seem to pop up whenever I have free time, but I have a poor track record of hitting them. To fill the extra time I have over the summer, I started reading Eisenbud's Commutative Algebra texbook as a side project. Let's hope I get far enough to learn something useful.

In this post, I want to write a bit about a result known as Hilbert's Basis Theorem (HBT). We'll motivate why we should care about it through a use case prevalent in algebraic geometry.

Motivating example

In algebraic geometry, one defines "shapes" and "curves" by taking a set of polynomials Sk[x1,,xr]S \subset k[x_1, \ldots, x_r] over an algebraically closed field kk and studying the set of points in krk^r where all the polynomials in SS are equal to zero: Z(S)={pkr:f(p)=0 for all fS}Z(S) = \{ p \in k^r : f(p) = 0~\text{for all}~f \in S\} We call such sets algebraic sets.

For example, if f=x2+y21R[x,y]f = x^2 + y^2 - 1 \in \mathbb{R}[x,y] then Z({f})Z(\{f\}) is the algebraic set consisting of points where x2+y2=1x^2 + y^2 = 1, which is precisely the unit circle.

Sets of polynomials seem pretty daunting; they can easily be uncountably large. Hence, at first glance characterizing these algebraic sets seems like a difficult task. However, Hilbert's basis theorem gives us a a very useful result which allows us to restrict our attention from arbitrary subsets of k[x1,,xr]k[x_1, \ldots, x_r] to just finitely generated ideals.

Corollary of Hilbert's basis theorem: Any algebraic set can be written as Z(I)Z(I) where Ik[x1,,xr]I \subset k[x_1, \ldots, x_r] is a finitely generated ideal.

This makes the problem significantly easier: since any fIf \in I can be represented using a finite basis f=i=1nkifif = \sum_{i=1}^n k_i f_i we only need to concern ourselves with the zero locus of a finite number of polynomials!

Statement of the theorem

To state the theorem in it's full generality, we will need the following definition. Let RR be a commutative ring.

Definition: RR is Noetherian if any ideal IRI \subset R is finitely generated.

An equivalent definition is the ascending chain condition: for any ascending chain of ideals of RR I1I2IkIk+1I_1 \subset I_2 \subset \cdots \subset I_k \subset I_{k+1} \subset \cdots there exists a nn after which In=In+1=In+2=I_n = I_{n+1} = I_{n+2} = \cdots

Hilbert's basis theorem says that adjoining elements to a Noetherian ring preserves the Noetherian property.

Theorem (Hilbert's basis theorem): If RR is Noetherian, then so is R[x]R[x].

Consequences

By induction, we get that:

Corollary: If RR is Noetherian, then so is R[x1,,xr]R[x_1, \ldots, x_r].

Noting that the only ideals of a field kk are {0}=0\{0\} = \langle 0 \rangle and k=1k = \langle 1 \rangle (if aIa \in I is non-zero, then since a1ka^{-1} \in k because II is an ideal we must have a1a=1Ia^{-1} a = 1 \in I, hence I1=kI \supset \langle 1 \rangle = k). As both of these are finitely generated:

Lemma: Any field kk is Noetherian

Together with Hilbert's basis theorem and the above corollary, we have

Corollary: k[x1,,xr]k[x_1, \ldots, x_r] is Noetherian; all of its ideals are finitely generated.

Our discussion up until now has focused only on ideals, whereas the motivating example was about algebraic sets defined by arbitrary subsets of k[x1,,xr]k[x_1, \ldots, x_r]. The following proposition ties the two together.

Proposition: For any SR[x1,,xr]S \subset R[x_1, \ldots, x_r], Z(S)=Z(S)Z(S) = Z(\langle S \rangle) where S\langle S \rangle is the ideal generated by SS.

Proof: Since SSS \subset \langle S \rangle, Z(S)Z(S)Z(S) \supset Z(\langle S \rangle) (aside: ZZ is a contravariant functor from the category consisting subsets of R[x1,,xr]R[x_1, \ldots, x_r] to the category consisting of algebraic sets on RrR^r, both with inclusions as arrows).

Conversely, if pZ(S)p \in Z(S) then evaluating any f=irifiSf = \sum_i r_i f_i \in \langle S \rangle at pp yields f(p)=irifi(p)=iri0=0f(p) = \sum_i r_i f_i(p) = \sum_i r_i 0 = 0 since the fiSf_i \in S are equal to zero for pSp \in S. \square

Proof of corollary in motivating example

Armed with these result, the proof of the corollary in the motivating example is swift.

Proof: An algebraic set is of the form Z(S)Z(S) where Sk[x1,,xr]S \subset k[x_1, \ldots, x_r]. By the previous proposition, Z(S)=Z(I)Z(S) = Z(I) where I=S)k[x1,,xr]I = \langle S \rangle) \subset k[x_1, \ldots, x_r] is an ideal. By the above corollary, k[x1,,xr]k[x_1, \ldots, x_r] is Noetherian hence II must be finitely generated. \square

Proving Hilbert's basis theorem

We've almost tied up all the loose ends in this discussion; all that remains is proving Hilbert's basis theorem itself. In this section, we will complete this last step.

Before we get there, we will need an alternate characterization of Noetherian rings.

Lemma: RR is Noetherian     \iff RR satisfies the ascending chain condition: for any ascending chain of ideals I1I2I3I_1 \subset I_2 \subset I_3 \subset \cdots there exists nNn \in \mathbb{N} after which the chain stabilizes, i.e. In=In+1=I_n = I_{n+1} = \cdots

Proof: Since I=iIiI = \cup_i I_i is an ideal of RR, it has a finite set of generators {ai}i=1k\{a_i\}_{i=1}^k. Each aiIjia_i \in I_{j_{i}} for some jiNj_{i} \in \mathbb{N}, so taking n=max1ikjin = \max_{1 \leq i \leq k} j_i suffices.

Conversely, define the ascending chain Ik=aii=1kI_k = \langle a_i \rangle_{i=1}^k where a0Ia_0 \in I and aiIIi1a_i \in I \setminus I_{i-1} are chosen arbitrarily. This chain stabilizes at some nn where I=In=aii=1kI = I_n = \langle a_i \rangle_{i=1}^k. \square

Proof of Hilbert's basis theorem: Let IR[x]I \subset R[x] be an ideal. We will construct a finite set of generators inductively. Choose f0If_0 \in I to be any element with least degree and for i1i \geq 1 pick fiIfkk=1i1f_i \in I \setminus \langle f_k \rangle_{k=1}^{i-1} any element with least degree. It remains to show that R[x]=fkk=1iR[x] = \langle f_k \rangle_{k=1}^{i} for some iNi \in \mathbb{N}. For each fif_i, we can write

fi=j=0degfiai,jxjf_i = \sum_{j=0}^{\deg f_i} a_{i,j} x^{j}

Consider the ascending chain of ideals Jk=(ai,degfi)i=1kJ_k = (a_{i, \deg f_i})_{i=1}^k generated by all the initial coefficients of the selected fif_i. This is an ascending chain, so J=aii=1mJ = \langle a_i \rangle_{i=1}^m for some mm. We claim that I=fii=1mI = \langle f_i \rangle_{i=1}^m.

To see this, let fm+1Ifkk=1i1f_{m+1} \in I \setminus \langle f_k \rangle_{k=1}^{i-1}. Then since am+1,degfm+1Ja_{m+1,\deg f_{m+1}} \in J, we have

am+1,degfm+1=j=1mrjaj,degfja_{m+1,\deg f_{m+1}} = \sum_{j=1}^m r_j a_{j,\deg f_j}

for some coefficients rjRr_j \in R. Since degfm+1degfmdegfm1\deg f_{m+1} \geq \deg f_m \geq \deg f_{m-1} \geq \cdots by how we chose these elements, we have that degfm+1degfj0\deg f_{m+1} - \deg f_j \geq 0 and therefore can define

g=j=1mujxdegfm+1degfjR[x]fjg = \sum_{j=1}^m \underbrace{u_j x^{\deg f_{m+1} - \deg f_j}}_{\in R[x]} f_j

By the definition of ideals, gfkk=1mg \in \langle f_k \rangle_{k=1}^{m}. Now notice fm+1gIf_{m+1} - g \in I (as the sum of two elements in II) but fm+1g∉fkk=1mf_{m+1} - g \not\in \langle f_k \rangle_{k=1}^{m} (otherwise adding gg would give that fm+1fkk=1mf_{m+1} \in \langle f_k \rangle_{k=1}^{m}), so fm+1gIfkk=1mf_{m+1} - g \in I \setminus \langle f_k \rangle_{k=1}^{m}

What's more, the initial coefficient of gg is given by (notation following Sedgewick and Flajolet)

[zdegg]g=[zdegfm+1]g=j=1muj[zdegfm+1]k=1degfjaj,kxdegfm+1degfj+k=j=1mujaj,degfj=am+1,degfm+1[z^{\deg g}] g = [z^{\deg f_{m+1}}] g = \sum_{j=1}^m u_j [z^{\deg f_{m+1}}] \sum_{k=1}^{\deg f_j} a_{j,k} x^{\deg f_{m+1} - \deg f_j + k} = \sum_{j=1}^m u_j a_{j,\deg f_j} = a_{m+1,\deg f_{m+1}}

Since the degree and initial coefficients of fm+1f_{m+1} and gg match, we have that deg(fm+1g)<degfm+1\deg(f_{m+1} - g) < \deg f_{m+1}, contradicting the choice of fm+1f_{m+1} having minimal degree amongst all polynomials in Ifkk=1mI \setminus \langle f_k \rangle_{k=1}^{m}. \square

Closing remarks

Our proof of Hilbert's basis theorem is almost identical to that in Eisenbud with some additional commentary and explanation. The technique of matching the initial term between two polynomials and arguing that the degree of the difference is strictly less is a common proof method when working with polynomials.

Most of our work was possible when polynomial coefficients were over a ring RR. We only needed coefficients to be from a field kk so that we could use kk being Noetherian to conclude the motivating example.

While we only showed that Z(S)=Z(S)Z(S) = Z(\langle S \rangle) for any subset Sk[x1,,xr]S \subset k[x_1, \ldots, x_r], the connection between algebraic sets (our geometric objects) and ideals (our algebraic objects) is much deeper. In fact, it's not terribly hard to show that Z(I(X))=XZ(I(X)) = \overline{X} for any XkrX \subset k^r, where I(X)={fk[x1,,xr]:f(p)=0 for all pX}I(X) = \{f \in k[x_1, \ldots, x_r] : f(p) = 0~\text{for all}~p \in X\} (exercise: prove this is an ideal) and X\overline{X} is the smallest algebraic set containing XX (i.e. the closure of XX under the Zariski topology).

Here, one may be tempted to conclude that algebraic sets of krk^r and ideals of k[x1,,xr]k[x_1, \ldots, x_r] are in correspondence, but this is false. The precise statement here is Hilbert's Nullstellensatz (which we will likely have more to say about later):

The correspondence IZ(I)I \mapsto Z(I) and XI(X)X \mapsto I(X) induces a bijection between the algebraic sets of krk^r and the radical ideals of k[x1,,xr]k[x_1, \ldots, x_r].