8 Galois Groups of polynomials
8.1 Symmetric functions
Suppose we have a field \(k\) of characteristic \(0\) and integer \(n\geq 1\) (actually, everything in this section will work in positive characteristic \(p\) as long as \(p\) doesn’t divide \(n!\)). Let \(L=k(x_1,x_2,...,x_n)\) be the field of rational expressions in independent indeterminates \(x_1,x_2,...,x_n\) with coefficients in \(k\). We can make the symmetric group \(S_n\) act on \(L\) by permuting the variables: given \(\pi\in S_n\), we let \(x_i\mapsto x_{\pi(i)}\). Specifically, if \[\pi=\begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n\end{pmatrix}\] then we have a \(k\)-automorphism \(\pi:L\to L\) given by \[\frac{f(x_1,x_2,...,x_n)}{g(x_1,x_2,...,x_n)}\longmapsto \frac{f(x_{i_1},x_{i_2},...,x_{i_n})}{g(x_{i_1},x_{i_2},...,x_{i_n})} \] Consider the fixed field \(L^{S_n}\) of \(L\) under this group of automorphisms \(S_n\). These are the symmetric functions in \(L\) and we’d like to fully describe them and the field extension \(L/L^{S_n}\).
Define the elementary symmetric polynomials \(e_r\in L\) for \(1\leq r\leq n\) by \[\begin{align*} e_1&=\sum_{1\leq i\leq n}x_i =x_1+x_2+\cdots+x_n\\ e_2&=\sum_{1\leq i<j\leq n}x_ix_j \\ &\vdots \\ e_r&=\sum_{1\leq i_1<...<i_r\leq n}x_{i_1}x_{i_2}\cdots x_{i_r} \\ &\vdots \\ e_n&=x_1x_2\cdots x_n \end{align*}\] Clearly, \(e_1,e_2,...,e_n\in L^{S_n}\) and if we set \(K=k(e_1,e_2,...,e_n)\) then \[K=k(e_1,e_2,...,e_n) \;\subset\; L^{S_n} \;\subset\; L=k(x_1,x_2,...,x_n).\]
Theorem 8.1 We have \(K=L^{S_n}\) and \(L/K\) is Galois with Galois group \(S_n\).
Proof. First notice that \[f(x)=(x-x_1)(x-x_2)\cdots(x-x_n)=x^n-e_1x^{n-1}+e_2x^{n-2}-\cdots+(-1)^ne_n.\] Now \(f(x)\in K[x]\) and \(L\) is a splitting field of \(f(x)\) over \(K\), since \(L=K(x_1,...,x_n)\) is generated over \(K\) by the roots of \(f(x)\). Furthermore, \(\deg f(x)=n\), so by Theorem 4.1 (the existence of splitting fields), we have \[ [L:K]\leq n!\] Moreover, \(L\) is a splitting field over \(K\) in characteristic \(0\), so \(L/K\) is Galois. Since \(K\subset L^{S_n}\subset L\), the extension \(L/L^{S_n}\) is also Galois, and its Galois group is \(S_n\) (acting as described above) because \(L^{S_n}\) is the fixed field of this finite group of automorphisms. Hence \[[L:K]\geq [L:L^{S_n}]=|S_n|=n!\] We deduce that \(K=L^{S_n}\) and \(\Gal(L/K)=S_n\).
Remark. \((a)\;\) Since any finite group \(G\) is a subgroup of \(S_n\) for some \(n\), we can now say there is a Galois extension with Galois group \(G\). Indeed, with \(L=k(x_1,...,x_n)\), we can let \(G\subset S_n\) act on \(L\) as above and then the extension \(L/L^G\) is Galois with Galois group \(G\). Note this is a lot easier than the unsolved Inverse Galois Problem, which asks whether or not every finite group appears as the Galois group of some Galois extension of \(\mathbb{Q}\).
\((b)\;\) Given a field \(K\) and polynomial \(f(x)\in K[x]\), we define the Galois group \(G_f\) of \(f(x)\) to be the Galois group of the splitting field of \(f(x)\) over \(K\) (provided this extension is separable). If \(\deg f(x)=n\), this group acts by permuting the roots \(\theta_1,...,\theta_n\) of \(f(x)\) so can be considered as a subgroup of \(S_n\). However, there may be non-trivial relationships between these roots (unlike the \(x_1,...,x_n\) treated above) and so we may find \(G_f\) is a proper subgroup of \(S_n\).
It’s easy to see when a polynomial in \(k[x_1,x_2,...,x_n]\) is symmetric: if it contains a monomial \(cx_1^{a_1}\cdots x_n^{a_n}\), then check it contains all the permutations \(cx_1^{a_{i_1}}\cdots x_n^{a_{i_{n\phantom{1}}}}\) with the same coefficient \(c\). For the purposes of Galois Theory, it’s also useful to express various symmetric polynomials in terms of the elementary ones and the above Theorem implies the following famous result:
Corollary 8.1 Any symmetric polynomial in \(k[x_1,x_2,...,x_n]\) can be expressed as a polynomial in elementary symmetric polynomials, i.e. \[ k[x_1,x_2,...,x_n]^{S_n}=k[e_1,e_2,...,e_n].\]
When the number of variables and polynomial degrees are not too large, one can do ad hoc calculations. For instance, when \(n=2\), we can express \[\begin{align*} e_1^2&=(x_1+x_2)^2=x_1^2+2x_1x_2+x_2^2 = x_1^2+x_2^2+2e_2 \\ &\implies\quad x_1^2+x_2^2= e_1^2-2e_2. \end{align*}\] and \[\begin{align*} e_1^3&=(x_1+x_2)^3=x_1^3+3x_1^2x_2+3x_1x_2^2+x_2^3 = x_1^3+x_2^3+3(x_1+x_2)x_1x_2 \\ &\implies\quad x_1^3+x_2^3= e_1^3-3e_1e_2. \end{align*}\] Similarly, when \(n=3\), \[\begin{align*} e_1e_2&=(x_1+x_2+x_3)(x_1x_2+x_2x_3+x_3x_1) \\ &=x_1^2x_2+x_1x_2^2+x_2^2x_3+x_2x_3^2+x_3^2x_1+x_3x_1^2+3x_1x_2x_3 \\[5pt] &\implies\quad x_1^2x_2+x_1x_2^2+x_2^2x_3+x_2x_3^2+x_3^2x_1+x_3x_1^2=e_1e_2-3e_3. \end{align*}\]
For general \(n\) and symmetric polynomial \(f=f(x_1,...,x_n)\in k[x_1,...,x_n]^{S_n}\), there is an efficient algorithm.
Define the lexicographic ordering of monomials: \(x_1^{a_1}\cdots x_n^{a_n}\underset{lex}{>}x_1^{b_1}\cdots x_n^{b_n}\) if and only if there exists \(0\leq j\leq n-1\) such that \(a_1=b_1\), \(a_2=b_2\), … \(a_{j}=b_{j}\) and \(a_{j+1}>b_{j+1}\). In other words, give priority to the \(x_1\) power, then the \(x_2\) power, and so on, just like ordering words in a dictionary. For example, we have \(x_1x_2^3x_3^2\underset{lex}{>}x_1x_2x_3^4\).
Define the leading term of \(f\) to be the biggest monomial \(cx_1^{a_1}\cdots x_n^{a_n}\) with \(c\neq 0\) in \(f\) with respect to this lexicographic ordering. Notice when \(f\) is symmetric, this leading term automatically has \(a_1\geq a_2\geq\cdots\geq a_n\).
The Algorithm now proceeds as follows:
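Step 1: Given symmetric \(f=f(x_1,...,x_n)\), find the leading term \(cx_1^{a_1}\cdots x_n^{a_n}\) and calculate \[f_1=f-ce_1^{a_1-a_2}e_2^{a_2-a_3}\cdots e_{n-1}^{a_{n-1}-a_n}e_n^{a_n}.\] Notice \(f_1\) is symmetric. Moreover, the leading term of the product \(e_1^{a_1-a_2}e_2^{a_2-a_3}\cdots e_n^{a_n}\) is exactly \(x_1^{a_1}\cdots x_n^{a_n}\), so the leading terms cancel and the leading term of \(f_1\) is strictly smaller than the leading term of \(f\).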
Step 2: If \(f_1\neq 0\), then apply Step 1 to \(f_1\) to get \(f_2\) and so on.
Since the size of the leading terms of \(f_1, f_2,...\) is strictly decreasing, the algorithm eventually terminates and we end up with an expression for \(f(x_1,...,x_n)\) as a polynomial in \(e_1,...,e_n\).
For instance, with \(n=2\) and \(f=f(x_1,x_2)=x_1^3+x_2^3\) we obtain the following:
Leading term of \(f\) is \(x_1^3=x_1^3x_2^0\) so \[f_1=f-e_1^{3-0}e_2^0=x_1^3+x_2^3-(x_1+x_2)^3=-3x_1^2x_2-3x_1x_2^2.\]
Leading term of \(f_1\) is \(-3x_1^2x_2\) so \[f_2=f_1-(-3)e_1^{2-1}e_2^1=-3x_1^2x_2-3x_1x_2^2+3(x_1+x_2)x_1x_2=0.\]
Rearranging gives \(f_1=-3e_1e_2\) and \(f=e_1^3+f_1=e_1^3-3e_1e_2\).
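The algorithm above can be sketched in a few lines of Python. This is only an illustration under our own conventions, not part of the notes: a polynomial in \(n\) variables is stored as a dictionary from exponent tuples to coefficients, and the helper names `multiply`, `subtract`, `elementary` and `symmetric_reduce` are hypothetical. Python compares tuples lexicographically, which matches the ordering defined above.

```python
from itertools import combinations

# A polynomial in n variables is a dict {exponent tuple: coefficient}.

def multiply(p, q):
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(i + j for i, j in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}

def subtract(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) - c
    return {e: c for e, c in out.items() if c != 0}

def elementary(r, n):
    """The elementary symmetric polynomial e_r in n variables."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), r)}

def symmetric_reduce(f, n):
    """Write a symmetric f as a polynomial in e_1, ..., e_n.

    Returns {(d_1, ..., d_n): c}, meaning the sum of c * e_1^d_1 ... e_n^d_n.
    """
    result = {}
    while f:
        lead = max(f)                  # lexicographically largest monomial
        c = f[lead]
        a = lead + (0,)
        d = tuple(a[i] - a[i + 1] for i in range(n))
        if min(d) < 0:
            raise ValueError("input is not symmetric")
        result[d] = c
        # Step 1: build c * e_1^(a_1-a_2) ... e_n^(a_n) and subtract it.
        term = {(0,) * n: c}
        for r in range(1, n + 1):
            for _ in range(d[r - 1]):
                term = multiply(term, elementary(r, n))
        f = subtract(f, term)
    return result

# x_1^3 + x_2^3 with n = 2
print(symmetric_reduce({(3, 0): 1, (0, 3): 1}, 2))  # {(3, 0): 1, (1, 1): -3}
```

The printed dictionary encodes \(e_1^3-3e_1e_2\), in agreement with the hand computation.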
Explicit computations when \(n=3\)
We’ll now find some special symmetric polynomials which will be used when solving and finding Galois groups of general cubic polynomials. Assume that our base field \(k\) contains a primitive \(3\)-rd root of unity \(\omega\) (so that \(\omega+\omega^2=-1\)) and consider \[\begin{align*} \theta_1&=(x_1+\omega x_2+\omega^2 x_3)/3, \\ \theta_2&=(x_1+\omega^2 x_2+\omega x_3)/3. \end{align*}\] These are not symmetric polynomials in \(k[x_1,x_2,x_3]\). However, \(\theta_1\theta_2\) and \(\theta_1^3+\theta_2^3\) are symmetric, as can be seen by direct computation: \[\begin{align*} 9\theta_1\theta_2&=x_1^2+x_2^2+x_3^2-(x_1x_2+x_2x_3+x_3x_1) \\ &=(x_1+x_2+x_3)^2-3(x_1x_2+x_2x_3+x_3x_1)=e_1^2-3e_2 \end{align*}\] Similarly, \[\begin{align*} 27\theta_1^3&=x_1^3+x_2^3+x_3^3+3\omega(x_1^2x_2+x_2^2x_3+x_3^2x_1)+3\omega^2(x_1x_2^2+x_2x_3^2+x_3x_1^2)+6x_1x_2x_3 \\ 27\theta_2^3&=x_1^3+x_2^3+x_3^3+3\omega^2(x_1^2x_2+x_2^2x_3+x_3^2x_1)+3\omega(x_1x_2^2+x_2x_3^2+x_3x_1^2)+6x_1x_2x_3 \end{align*}\] and so \(27(\theta_1^3+\theta_2^3)\) equals \[2(x_1^3+x_2^3+x_3^3)-3(x_1^2x_2+x_1x_2^2+x_2^2x_3+x_2x_3^2+x_3^2x_1+x_3x_1^2)+12x_1x_2x_3.\] Now apply the algorithm to show this equals \(2e_1^3-9e_1e_2+27e_3\). We leave this as an exercise. So far, we have \[\theta_1\theta_2=\frac{e_1^2-3e_2}{9}\qquad\text{and}\qquad \theta_1^3+\theta_2^3=\frac{2e_1^3-9e_1e_2+27e_3}{27}.\] Also, from the above \[\begin{align*} 27(\theta_1^3-\theta_2^3)&=3(\omega-\omega^2)(x_1^2x_2+x_2^2x_3+x_3^2x_1-x_1x_2^2-x_2x_3^2-x_3x_1^2) \\ &=-3\sqrt{-3}(x_1-x_2)(x_2-x_3)(x_3-x_1) \end{align*}\] and so we have another interesting symmetric polynomial \[\begin{align*} (x_1-x_2)^2(x_2-x_3)^2(x_3-x_1)^2 &=-27(\theta_1^3-\theta_2^3)^2 =-27\left[ (\theta_1^3+\theta_2^3)^2-4\theta_1^3\theta_2^3 \right] \\ &=-27\left[ \left(\frac{2e_1^3-9e_1e_2+27e_3}{27}\right)^2-4\left(\frac{e_1^2-3e_2}{9}\right)^3 \right] \\ &=18e_1e_2e_3-4e_1^3e_3+e_1^2e_2^2-4e_2^3-27e_3^2. \end{align*}\]
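A quick numerical spot check of the \(n=3\) identities is sketched below. It works over \(\mathbb{C}\) with \(\omega=e^{2\pi i/3}\) in floating point, so it gives evidence rather than a proof; the function name `cubic_identities_hold` and the tolerance are our own choices.

```python
import cmath

def cubic_identities_hold(x1, x2, x3, tol=1e-9):
    """Compare both sides of the n = 3 identities at one sample point."""
    w = cmath.exp(2j * cmath.pi / 3)          # primitive cube root of unity
    e1 = x1 + x2 + x3
    e2 = x1 * x2 + x2 * x3 + x3 * x1
    e3 = x1 * x2 * x3
    th1 = (x1 + w * x2 + w**2 * x3) / 3
    th2 = (x1 + w**2 * x2 + w * x3) / 3
    disc = ((x1 - x2) * (x2 - x3) * (x3 - x1)) ** 2
    pairs = [
        (9 * th1 * th2, e1**2 - 3 * e2),
        (27 * (th1**3 + th2**3), 2 * e1**3 - 9 * e1 * e2 + 27 * e3),
        (-27 * (th1**3 - th2**3) ** 2, disc),
        (18 * e1 * e2 * e3 - 4 * e1**3 * e3 + e1**2 * e2**2
         - 4 * e2**3 - 27 * e3**2, disc),
    ]
    # Relative tolerance, since the values grow with the inputs.
    return all(abs(lhs - rhs) <= tol * max(1, abs(rhs)) for lhs, rhs in pairs)

print(cubic_identities_hold(1, 2, 4))
```

Agreement at a handful of points does not replace the algebra, but it is a cheap way to catch a sign or coefficient slip.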
Explicit computations when \(n=4\)
Now for some special symmetric polynomials used when solving and finding Galois groups of general quartic polynomials. Consider the elements \[\begin{align*} \theta_1&=(x_1+x_2-x_3-x_4)/2, \\ \theta_2&=(x_1-x_2+x_3-x_4)/2, \\ \theta_3&=(x_1-x_2-x_3+x_4)/2. \end{align*}\] Again, these are not symmetric polynomials in \(k[x_1,x_2,x_3,x_4]\) but they do give rise to three polynomials which are symmetric, namely \(\theta_1^2+\theta_2^2+\theta_3^2\), \(\theta_1\theta_2\theta_3\) and \(\theta_1^2\,\theta_2^2+\theta_2^2\,\theta_3^2+\theta_3^2\,\theta_1^2\). Check that these are indeed symmetric and apply the algorithm to find \[\theta_1^2+\theta_2^2+\theta_3^2=\frac{3e_1^2-8e_2}{4},\qquad\theta_1\theta_2\theta_3=\frac{e_1^3-4e_1e_2+8e_3}{8}\] \[\text{and}\qquad \theta_1^2\,\theta_2^2+\theta_2^2\,\theta_3^2+\theta_3^2\,\theta_1^2 =\frac{3e_1^4-16e_1^2e_2+16e_1e_3+16e_2^2-64e_4}{16}.\]
We leave this as an exercise for the keen/bored reader. As you might imagine, it takes a fair bit of elementary but tedious calculation, particularly for the last one which is an identity between several degree \(4\) polynomials in \(4\) variables. One way to speed up the work is to introduce auxiliary polynomials defined by \(4\psi_j=e_1^2-4\theta_j^2\) for \(j=1,2,3\). Check that \[\begin{align*} \psi_1&=(x_1+x_2)(x_3+x_4), \\ \psi_2&=(x_1+x_3)(x_2+x_4), \\ \psi_3&=(x_1+x_4)(x_2+x_3). \end{align*}\] and then \(16(\theta_1^2\,\theta_2^2+\theta_2^2\,\theta_3^2+\theta_3^2\theta_1^2)\) equals \[\begin{multline*} \qquad (e_1^2-4\psi_1)(e_1^2-4\psi_2)+(e_1^2-4\psi_1)(e_1^2-4\psi_3)+(e_1^2-4\psi_2)(e_1^2-4\psi_3) \\ =3e_1^4-8e_1^2(\psi_1+\psi_2+\psi_3)+16(\psi_1\psi_2+\psi_2\psi_3+\psi_3\psi_1).\qquad \end{multline*}\] One then finds \(\psi_1+\psi_2+\psi_3=2e_2\) easily and computes \[\psi_1\psi_2+\psi_2\psi_3+\psi_3\psi_1=e_1e_3+e_2^2-4e_4\] via the algorithm.
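For readers who prefer to let a computer do the checking, here is a small sketch that tests the \(n=4\) identities above in exact rational arithmetic at sample points. The function name `quartic_identities_hold` is hypothetical, and agreement at a few points is only a sanity check, not a proof.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def quartic_identities_hold(x1, x2, x3, x4):
    """Check the theta and psi identities exactly at one rational point."""
    xs = [Fraction(x) for x in (x1, x2, x3, x4)]
    x1, x2, x3, x4 = xs
    e1, e2, e3, e4 = (sum(prod(c) for c in combinations(xs, r))
                      for r in (1, 2, 3, 4))
    t1 = (x1 + x2 - x3 - x4) / 2
    t2 = (x1 - x2 + x3 - x4) / 2
    t3 = (x1 - x2 - x3 + x4) / 2
    p1 = (x1 + x2) * (x3 + x4)
    p2 = (x1 + x3) * (x2 + x4)
    p3 = (x1 + x4) * (x2 + x3)
    return all([
        t1**2 + t2**2 + t3**2 == (3 * e1**2 - 8 * e2) / 4,
        t1 * t2 * t3 == (e1**3 - 4 * e1 * e2 + 8 * e3) / 8,
        (t1 * t2)**2 + (t2 * t3)**2 + (t3 * t1)**2
        == (3 * e1**4 - 16 * e1**2 * e2 + 16 * e1 * e3
            + 16 * e2**2 - 64 * e4) / 16,
        p1 + p2 + p3 == 2 * e2,
        p1 * p2 + p2 * p3 + p3 * p1 == e1 * e3 + e2**2 - 4 * e4,
        all(4 * p == e1**2 - 4 * t**2
            for p, t in ((p1, t1), (p2, t2), (p3, t3))),
    ])

print(quartic_identities_hold(1, 2, 3, 5))
```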
Remark. Of course, once we have an identity involving elementary symmetric polynomials as above (whether we find it by hand or by using a computer or by looking in a book), it’s easy to check by high school algebra or just accept it and move on to apply it however we wish. That’s what we’ll do in the next sections where we solve and find Galois groups of general cubic and quartic polynomials. Fortunately, there will be simplifications which mean we don’t have to memorise all these formulas (though there are some things we will have to learn…!)
8.2 Galois Theory for cubic polynomials
Given a field \(k\) and polynomial \(f(x)\in k[x]\), the Galois group \(G_f\) is by definition the Galois group of a splitting field \(L\) of \(f(x)\) over \(k\). We’ll always assume that \(f(x)\) has distinct roots so that the extension \(L/k\) is separable and hence Galois. If \(\deg f(x)=n\), then we can embed \(G_f\) in the symmetric group \(S_n\) as \(G_f\) permutes the roots of \(f(x)\). As well as the symmetric polynomials discussed in the last section, one of the key tools is the Lagrange resolvent construction which allows us to describe cyclic extensions explicitly. We recall that if a field \(K\) contains a primitive \(n\)-th root of unity \(\zeta\) and \(\Gal(L/K)=\langle\,\sigma\,\rangle\cong\mathbb{Z}/n\), then there is an element \(b\in L\) such that the Lagrange resolvent \[\theta_b=b+\zeta^{-1}\sigma(b)+\zeta^{-2}\sigma^2(b)+\cdots+\zeta^{-(n-1)}\sigma^{n-1}(b)\neq 0.\] In that case, \(L=K(\theta_b)\) and \(\theta_b^n=a\in K\).
Before beginning cubic polynomials, let’s first re-examine the case of quadratic polynomials from this point of view. Let \(k\) be a field with characteristic not equal to \(2\). A general quadratic polynomial can be written \[f(x)=x^2-e_1x+e_2=(x-x_1)(x-x_2)\in K[x]\] where \(e_1=x_1+x_2\), \(e_2=x_1x_2\) lie in \(K=k(e_1,e_2)\). As in the last section, the field extension \[K=k(e_1,e_2) \;\subset\; L=k(x_1,x_2)=K(x_1)\] is Galois and we have \(\Gal(L/K)=\{\id,\sigma\}\cong S_2\cong\mathbb{Z}/2\) where \(\sigma(x_1)=x_2\) and \(\sigma(x_2)=x_1\). Since \(L/K\) is cyclic and \(\zeta_2=-1\in K\), the Lagrange resolvent of \(x_1\) is \[\theta=\theta_{x_1}=x_1+\zeta_2^{-1}\sigma(x_1)=x_1-x_2.\] Clearly, \(\sigma(\theta)=-\theta\) and \[\begin{align*} \theta^2&=(x_1-x_2)^2=x_1^2+x_2^2-2x_1x_2=(x_1+x_2)^2-4x_1x_2 \\ &=e_1^2-4e_2\in K \end{align*}\] The element \(\Delta=(x_1-x_2)^2=e_1^2-4e_2\) is the discriminant of \(f(x)\). It detects when the roots \(x_1,x_2\) are equal, but here they are independent variables. We can recover the roots \(x_1\), \(x_2\) from \(e_1\), \(e_2\) via relations \[x_1+x_2=e_1\qquad\text{and}\qquad x_1-x_2=\theta=\sqrt{\Delta}.\] We find the familiar formulas \(x_1=(e_1+\sqrt{\Delta})/2\) and \(x_2=(e_1-\sqrt{\Delta})/2\).
Finally, given an arbitrary irreducible quadratic polynomial \(f(x)=x^2+ax+b\in k[x]\) (which necessarily has distinct roots), the above (with \(e_1=-a\), \(e_2=b\)) allows us to find its roots \[\frac{-a\pm\sqrt{a^2-4b}}{2}\] as well as showing the Galois group \(G_f\) is isomorphic to \(\mathbb{Z}/2\).
Solving cubic equations
Now let \(k\) be a field with characteristic not \(2\) or \(3\) and consider a general cubic polynomial \[f(x)=x^3-e_1x^2+e_2x-e_3=(x-x_1)(x-x_2)(x-x_3)\in K[x]\] where \(e_1=x_1+x_2+x_3\), \(e_2=x_1x_2+x_2x_3+x_3x_1\), \(e_3=x_1x_2x_3\) lie in \[K=k(e_1,e_2,e_3)\subset L=K(x_1,x_2,x_3).\] As in the last section, we have \(\Gal(L/K)=S_3\) which is easily described as the symmetries of an equilateral triangle with vertices \(\{1,2,3\}\). It contains the normal subgroup \(A_3=\mathbb{Z}/3\) of rotations \[\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3\end{pmatrix}=\id,\quad \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1\end{pmatrix}=(123),\quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2\end{pmatrix}=(132), \] as well as the coset \(S_3\setminus A_3=(12)A_3\) of \(A_3\) \[\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3\end{pmatrix}=(12),\quad \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2\end{pmatrix}=(23),\quad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1\end{pmatrix}=(13). \] There’s a tower of extensions \(K\subset M=L^{A_3}\subset L\) where \(\Gal(L/M)\cong A_3=\mathbb{Z}/3\) and \(\Gal(M/K)\cong\mathbb{Z}/2\).
Assume that our base field \(k\) contains a primitive \(3\)-rd root of unity \(\omega\), and then \(\omega^2\) is also a primitive \(3\)-rd root of unity. Each of these gives a Lagrange resolvent for \(x_1\) which we can use to describe the cyclic extension \(L/M\). Define \[\begin{align*} \theta_1 &= (x_1+\omega x_2+ \omega^2 x_3)/3 \\ \theta_2 &= (x_1+\omega^2 x_2+ \omega x_3)/3. \end{align*}\] Then \(\theta_1^3=t_1\) and \(\theta_2^3=t_2\) are in \(M\) and \[L=M(\theta_1)=M(\sqrt[3]{t_1})\qquad\text{and}\qquad L=M(\theta_2)=M(\sqrt[3]{t_2}).\] Using the calculations from the last section, we have \[27(\theta_1^3+\theta_2^3)=2e_1^3-9e_1e_2+27e_3,\qquad 9\theta_1\theta_2=e_1^2-3e_2\] Finally, \(t_1=\theta_1^3\) and \(t_2=\theta_2^3\) are roots of the quadratic resolvent of \(f(x)\) \[(t-t_1)(t-t_2)= t^2-\left(\frac{2e_1^3-9e_1e_2+27e_3}{27}\right)t+\left(\frac{e_1^2-3e_2}{9}\right)^3. \] We can now find the roots \(x_1\), \(x_2\), \(x_3\) of \(f(x)\) as follows:
Step 1: Solve the above quadratic resolvent to find \(t_1\), \(t_2\).
Step 2: Choose a cube root \(\theta_1=\sqrt[3]{t_1}\) and let \(\theta_2\) satisfy \(9\theta_1\theta_2=e_1^2-3e_2\) (then automatically \(\theta_2^3=t_2\)).
Step 3: Finally, solve the following system of three linear equations \[ \begin{cases} x_1+\hspace{1em} x_2+\hspace{1em} x_3 &= e_1 \\ x_1+\hspace{0.4em} \omega x_2+ \omega^2 x_3 &= 3\theta_1 \\ x_1+\omega^2 x_2+\hspace{0.4em} \omega x_3 &= 3\theta_2 \end{cases} \quad\implies\quad \begin{cases} x_1 = e_1/3+\theta_1+\theta_2 \\ x_2 = e_1/3+\omega^2\theta_1+\omega\theta_2 \\ x_3 = e_1/3+\omega\theta_1+\omega^2\theta_2 \\ \end{cases} \]
Remark. In practice, when finding roots of a general cubic \(f(x)=x^3+ax^2+bx+c\), one doesn’t use the above formulas directly. Instead, first perform a shift to kill the coefficient of \(x^2\): \[\begin{align*} f(x-a/3)&=(x-a/3)^3+a(x-a/3)^2+b(x-a/3)+c \\ &=x^3+px+q. \end{align*}\] For this reduced cubic \(x^3+px+q\), the formulas simplify since \(e_1=0\), \(e_2=p\), \(e_3=-q\):
The quadratic resolvent is \(t^2+qt-p^3/27\).
The roots of this are \(t_1=\theta_1^3\), \(t_2=\theta_2^3\) and we choose cube roots so that \(\theta_1\theta_2=-p/3\).
The roots of \(f(x-a/3)\) are \(x_1=\theta_1+\theta_2\), \(x_2=\omega^2\theta_1+\omega\theta_2\), \(x_3=\omega\theta_1+\omega^2\theta_2\) and we recover the roots of \(f(x)\) by subtracting \(a/3\) from \(x_1,x_2,x_3\).
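The recipe in this remark translates directly into a short numerical routine over \(\mathbb{C}\). The sketch below is one possible floating-point rendering, with the hypothetical name `solve_cubic`; the choice of the larger resolvent root for \(t_1\) is our own precaution to avoid dividing by \(\theta_1=0\), and the triple-root test relies on exact cancellation, which floating point does not guarantee in general.

```python
import cmath

def solve_cubic(a, b, c):
    """Complex roots of x^3 + a x^2 + b x + c via the quadratic resolvent."""
    # Shift x -> x - a/3 to reach the reduced cubic x^3 + p x + q.
    p = b - a * a / 3
    q = 2 * a**3 / 27 - a * b / 3 + c
    w = cmath.exp(2j * cmath.pi / 3)          # primitive cube root of unity
    # Roots of the quadratic resolvent t^2 + q t - p^3/27.
    s = cmath.sqrt(q * q + 4 * p**3 / 27)
    t1 = max((-q + s) / 2, (-q - s) / 2, key=abs)
    if t1 == 0:                               # p = q = 0: a triple root
        return [complex(-a / 3)] * 3
    theta1 = t1 ** (1 / 3)                    # any cube root of t1 will do
    theta2 = -p / (3 * theta1)                # forces theta1 * theta2 = -p/3
    ys = [theta1 + theta2,
          w**2 * theta1 + w * theta2,
          w * theta1 + w**2 * theta2]
    return [y - a / 3 for y in ys]            # undo the shift

# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
print(sorted(round(z.real, 6) for z in solve_cubic(-6, 11, -6)))
```

Fixing \(\theta_2\) through \(\theta_1\theta_2=-p/3\), rather than taking an independent cube root of \(t_2\), is what guarantees the three combinations are actually roots.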
Galois groups of cubic polynomials
Let \(K\) be a field with characteristic not \(2\) or \(3\) and \[f(x)=x^3+ax^2+bx+c\in K[x].\] Then the Galois group is \(G_f=\Gal(L/K)\) where \(L\) is a splitting field for \(f(x)\) over \(K\) and this is a subgroup of \(S_3\). Let \(\alpha_1,\alpha_2,\alpha_3\) be the roots of \(f(x)\) in \(L\). Now \(f(x)\) may be reducible in \(K[x]\):
\(f(x)=(x-\alpha_1)(x-\alpha_2)(x-\alpha_3)\) where \(\alpha_1,\alpha_2,\alpha_3\in K\). Then clearly \(L=K\) and \(G_f=\{\id\}\).
\(f(x)=(x-\alpha_j)g(x)\) where \(\alpha_j\in K\) and \(g(x)\) is an irreducible quadratic in \(K[x]\). As we’ve seen, that means \([L:K]=2\) and \(G_f\cong\mathbb{Z}/2\).
If \(f(x)\) is irreducible in \(K[x]\), then \(K\subset K(\alpha_1)\subset L\) where \([K(\alpha_1):K]=3\). Then
either \([L:K(\alpha_1)]=1\implies[L:K]=3\) and so \(G_f\cong A_3=\mathbb{Z}/3\)
or \([L:K(\alpha_1)]=2\implies[L:K]=6\) and so \(G_f\cong S_3\).
How can we easily distinguish between these two cases? Introduce the following important element \[\delta=(\alpha_1-\alpha_2)(\alpha_2-\alpha_3)(\alpha_3-\alpha_1)\in L.\] We note that \(\delta\neq 0\) as we’ve assumed the roots are distinct.
Suppose \(G_f\cong A_3\). Clearly this is generated by \(\sigma\) which cycles the roots: \[\sigma(\alpha_1)=\alpha_2,\qquad\sigma(\alpha_2)=\alpha_3,\qquad\sigma(\alpha_3)=\alpha_1.\] This implies that \(\sigma(\delta)=\delta\) and so \(\delta\in L^{G_f}=K\).
Suppose \(G_f\cong S_3\). We still have \(\sigma(\delta)=\delta\) for each \(\sigma\in A_3\subset S_3\). However, now \(\sigma(\delta)=-\delta\) if \(\sigma\in S_3\setminus A_3\) and so \(\delta\not\in K\).
The discriminant of \(f(x)\) is defined by \[\Delta=\delta^2=(\alpha_1-\alpha_2)^2(\alpha_2-\alpha_3)^2(\alpha_3-\alpha_1)^2\] and we have proved the following result:
Theorem 8.2 Suppose that \(f(x)\in K[x]\) is irreducible of degree \(3\). Then
\(\;\;(i)\) we have \(G_f\cong A_3\;\;\iff\;\;\Delta\in {K^\times}^2\),
\(\;(ii)\) we have \(G_f\cong S_3\;\;\iff\;\;\Delta\in K^\times\setminus {K^\times}^2\).
This doesn’t appear to be so useful, as we apparently need to find the roots first. However, in the section on symmetric functions we found \[(x_1-x_2)^2(x_2-x_3)^2(x_3-x_1)^2 =18e_1e_2e_3-4e_1^3e_3+e_1^2e_2^2-4e_2^3-27e_3^2.\] Using this (and \(e_1=-a\), \(e_2=b\), \(e_3=-c\)) we can express \(\Delta\) in terms of the coefficients of \(f(x)\).
Theorem 8.3 For a general cubic polynomial \(f(x)=x^3+ax^2+bx+c\in K[x]\), the discriminant is \[\Delta=18abc-4a^3c+a^2b^2-4b^3-27c^2.\] For a reduced cubic polynomial \(f(x)=x^3+px+q\), this simplifies to \(\Delta=-4p^3-27q^2\).
Example 8.1 Find the Galois group \(G_f\) of the polynomial \(f(x)=x^3+x^2-4x+1\in\mathbb{Q}[x]\).
Check that the polynomial is irreducible (show there are no linear factors by the rational root test). Then substituting \(a=1\), \(b=-4\), \(c=1\) into the formula gives \(\Delta=169\). Since \(\Delta=13^2\in{\mathbb{Q}^\times}^2\), we have found that \(G_f\cong A_3=\mathbb{Z}/3\).
Remark. In practice, it’s easier to reduce the cubic first, then use the simpler formula for the discriminant. \[\begin{align*} f\left(x-\frac{1}{3}\right)&=\left(x-\frac{1}{3}\right)^3+\left(x-\frac{1}{3}\right)^2-4\left(x-\frac{1}{3}\right)+1 \\ &=x^3-\frac{13}{3}x+\frac{65}{27} \end{align*}\] Notice that this will have the same discriminant as \(f(x)\). Indeed, shifting each root \(\alpha_1,\alpha_2,\alpha_3\) by \(1/3\) will not change \[\Delta=(\alpha_1-\alpha_2)^2(\alpha_2-\alpha_3)^2(\alpha_3-\alpha_1)^2.\] In particular, \[\Delta=-4\left(-\frac{13}{3}\right)^3-27\left(\frac{65}{27}\right)^2=\frac{13^2}{3^3}(4\cdot 13-25)=13^2.\]
If you did Exercise 6.9(f) correctly, then the polynomial \(f(x)=x^3+x^2-4x+1\) might look familiar. Its roots lie in the (unique) cubic subfield of \(\mathbb{Q}(\zeta_{13})\) and \(G_f\) is a quotient of \(\Gal(\mathbb{Q}(\zeta_{13})/\mathbb{Q})\). Since this is abelian, \(G_f\) is also abelian and must be isomorphic to \(\mathbb{Z}/3\).
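Theorems 8.2 and 8.3 combine into a mechanical test for monic cubics with integer coefficients over \(\mathbb{Q}\). The following sketch uses hypothetical helper names; it checks irreducibility with the rational root test (for a monic integer polynomial, any rational root is an integer dividing the constant term) and then tests whether \(\Delta\) is a perfect square.

```python
from math import isqrt

def cubic_discriminant(a, b, c):
    """Discriminant of x^3 + a x^2 + b x + c (Theorem 8.3)."""
    return 18*a*b*c - 4*a**3*c + a**2*b**2 - 4*b**3 - 27*c**2

def has_rational_root(a, b, c):
    """Rational root test for a monic cubic with integer coefficients."""
    if c == 0:
        return True                            # x = 0 is a root
    return any(x**3 + a*x**2 + b*x + c == 0
               for d in range(1, abs(c) + 1) if c % d == 0
               for x in (d, -d))

def cubic_galois_group(a, b, c):
    """Return "A3" or "S3" for an irreducible x^3 + a x^2 + b x + c over Q."""
    if has_rational_root(a, b, c):
        raise ValueError("reducible over Q, so Theorem 8.2 does not apply")
    disc = cubic_discriminant(a, b, c)
    is_square = disc > 0 and isqrt(disc) ** 2 == disc
    return "A3" if is_square else "S3"

print(cubic_discriminant(1, -4, 1), cubic_galois_group(1, -4, 1))  # 169 A3
print(cubic_galois_group(0, 0, -2))                                # S3
```

The second call is \(x^3-2\), whose discriminant \(-108\) is negative and hence not a square in \(\mathbb{Q}\).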
8.3 Galois Theory for quartic polynomials
Let \(k\) be a field with characteristic not \(2\) or \(3\) and \(K=k(e_1,e_2,e_3,e_4)\subset L=K(x_1,x_2,x_3,x_4)\).
Then \(L\) is the splitting field over \(K\) of the general quartic polynomial \[\begin{align*} f(x)&=x^4-e_1x^3+e_2x^2-e_3x+e_4 \\ &=(x-x_1)(x-x_2)(x-x_3)(x-x_4)\in K[x] \end{align*}\] where
\[\begin{align*} e_1&=x_1+x_2+x_3+x_4, \qquad e_2=x_1x_2+x_1x_3+x_1x_4+x_2x_3+x_2x_4+x_3x_4, \\ e_3&=x_1x_2x_3+x_1x_2x_4+x_1x_3x_4+x_2x_3x_4, \qquad e_4=x_1x_2x_3x_4 \end{align*}\] The Galois group \(\Gal(L/K)\) is isomorphic to the symmetric group \(S_4\) and we would like to understand its subgroups and quotients. One can visualise \(S_4\) as the symmetries of a regular tetrahedron with vertices labelled \(\{1,2,3,4\}\).

Consider the three pairs of opposite edges \[P_1=\{(1,2),(3,4)\},\qquad P_2=\{(1,3),(2,4)\},\qquad P_3=\{(1,4),(2,3)\}\] Any permutation of the four vertices will permute these three edge pairs and we obtain a map \[\pi:S_4\longrightarrow S_3.\] Here are some useful facts about \(\pi\):
It is a group homomorphism. (Composing two permutations of the vertices induces the composition of the corresponding two permutations of the edge pairs \(P_i\).)
It is surjective. (It’s not hard to see which vertices to switch to obtain \(P_1\leftrightarrow P_2\), etc.)
It has kernel \(\ker(\pi)=V_4\) consisting of four even permutations (it’s a copy of the Klein 4-group) \[V_4=\left\{\id, (12)(34), (13)(24), (14)(23) \right\}\cong\mathbb{Z}/2\times\mathbb{Z}/2\]
Inside \(S_4\) there is the subgroup \(A_4\) of even permutations (they are the orientation-preserving symmetries of the tetrahedron). The restriction of \(\pi\) gives a surjective homomorphism \(A_4\to A_3\) which also has kernel \(V_4\). Since \(V_4\) is a kernel, that means it is a normal subgroup of \(S_4\) and of \(A_4\). We remark that whilst \(V_4\) is the only subgroup of \(A_4\) isomorphic to \(\mathbb{Z}/2\times\mathbb{Z}/2\), there are actually four subgroups of \(S_4\) isomorphic to \(\mathbb{Z}/2\times\mathbb{Z}/2\). However, \(V_4\) as defined above is the only normal one.
Combining all this information, there is an increasing sequence of subgroups in \(S_4\) \[\{\id\} \;\subset\; \mathbb{Z}/2 \;\subset\; V_4 \;\subset\; A_4 \;\subset S_4\; \] and \(V_4\cong\mathbb{Z}/2\times\mathbb{Z}/2\), \(A_4/V_4\cong A_3=\mathbb{Z}/3\) and \(S_4/A_4\cong \mathbb{Z}/2\).
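The stated facts about \(\pi:S_4\to S_3\) are small enough to confirm by brute force over all \(24\) permutations. In the sketch below the vertices \(1,2,3,4\) are relabelled \(0,1,2,3\) and the edge pairs \(P_1,P_2,P_3\) are indexed \(0,1,2\); the names `PAIRS`, `induced` and `compose` are our own.

```python
from itertools import permutations

# Edge pairs P_1, P_2, P_3, with vertices relabelled 0..3.
PAIRS = [
    frozenset({frozenset({0, 1}), frozenset({2, 3})}),
    frozenset({frozenset({0, 2}), frozenset({1, 3})}),
    frozenset({frozenset({0, 3}), frozenset({1, 2})}),
]

def induced(perm):
    """Permutation of the three edge pairs induced by perm (i -> perm[i])."""
    out = []
    for pair in PAIRS:
        image = frozenset(frozenset(perm[v] for v in edge) for edge in pair)
        out.append(PAIRS.index(image))
    return tuple(out)

def compose(p, q):
    """The composite p after q, as a tuple."""
    return tuple(p[q[i]] for i in range(len(q)))

S4 = list(permutations(range(4)))
is_homomorphism = all(induced(compose(p, q)) == compose(induced(p), induced(q))
                      for p in S4 for q in S4)
image = {induced(p) for p in S4}
kernel = {p for p in S4 if induced(p) == (0, 1, 2)}

print(is_homomorphism, len(image), sorted(kernel))
```

The kernel printed here consists of the identity together with the three double transpositions, matching \(V_4\).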
Remark. Each successive step \(G_i\subset G_{i+1}\) in this sequence has \(G_i\) normal in \(G_{i+1}\) and \(G_{i+1}/G_i\) is cyclic. This gives the definition of a solvable group which is the key to deciding if a polynomial can be solved in radicals. Essentially, each of the cyclic steps provides a method to produce radicals via Kummer Theory.
Consider the tower \(K=L^{S_4} \;\subset\; L^{V_4} \;\subset\; L\):
We have \(\Gal(L/L^{V_4})=V_4\cong\mathbb{Z}/2\times\mathbb{Z}/2\). Therefore \(L/L^{V_4}\) appears as a biquadratic extension, such as \(\mathbb{Q}(\sqrt{2},\sqrt{3})/\mathbb{Q}\).
The group \(V_4\) is normal in \(S_4\) and so by the Fundamental Theorem, \(\Gal(L^{V_4}/K)\cong S_4/V_4\cong S_3\). Therefore \(L^{V_4}\) appears as the splitting field of a cubic polynomial over \(K\), as we studied in the last section.
Solving quartic equations
Define the three elements \[\begin{align*} \theta_1&=(x_1+x_2-x_3-x_4)/2, \\ \theta_2&=(x_1-x_2+x_3-x_4)/2, \\ \theta_3&=(x_1-x_2-x_3+x_4)/2. \end{align*}\] Then for any \(\sigma\in V_4\), we have \(\sigma(\theta_j)=\pm\theta_j\) for each \(j=1,2,3\). Actually, these \(\theta_j\) arise from Lagrange resolvents for the three quadratic subextensions of \(L^{V_4}\) in \(L\). They “look like” \(\sqrt{2},\sqrt{3},\sqrt{6}\) in \(\mathbb{Q}(\sqrt{2},\sqrt{3})\). The elements \(t_1=\theta_1^2\), \(t_2=\theta_2^2\), \(t_3=\theta_3^2\) are fixed by \(V_4\) and are permuted by the quotient group \(S_4/V_4\cong S_3\). Therefore, they are roots of a cubic polynomial (the cubic resolvent of \(f(x)\)) \[(t-t_1)(t-t_2)(t-t_3)=(t-\theta_1^2)(t-\theta_2^2)(t-\theta_3^2)=t^3+s_1t^2+s_2t+s_3 \] with coefficients \(s_1,s_2,s_3\in (L^{V_4})^{S_3}=L^{S_4}=K\). These coefficients can now be expressed in terms of elementary symmetric polynomials, using the work from Section 8.1. \[\begin{align*} s_1&=-(\theta_1^2+\theta_2^2+\theta_3^2)=-(3e_1^2-8e_2)/4 \\ s_2&=\theta_1^2\,\theta_2^2+\theta_2^2\,\theta_3^2+\theta_3^2\,\theta_1^2 =(3e_1^4-16e_1^2e_2+16e_1e_3+16e_2^2-64e_4)/16 \\ s_3&=-(\theta_1\theta_2\theta_3)^2=-\left((e_1^3-4e_1e_2+8e_3)/8\right)^2 \end{align*}\]
We can now find the roots \(x_1\), \(x_2\), \(x_3\), \(x_4\) of \(f(x)\) as follows:
Step 1: Solve the cubic resolvent to find \(t_1\), \(t_2\), \(t_3\).
Step 2: Set \(\theta_j=\pm\sqrt{\,t_j}\) where we choose the signs so that \(\theta_1\theta_2\theta_3=(e_1^3-4e_1e_2+8e_3)/8\).
Step 3: Finally, solve the following system of four linear equations \[ \begin{cases} x_1+x_2+x_3+x_4 = \,e_1 \\ x_1+x_2-x_3-x_4 = 2\theta_1 \\ x_1-x_2+x_3-x_4 = 2\theta_2 \\ x_1-x_2-x_3+x_4 = 2\theta_3 \\ \end{cases} \quad\implies\quad \begin{cases} x_1 = e_1/4+\hspace{0.8em}(\theta_1+\theta_2+\theta_3\;)\,/\,2 \\ x_2 = e_1/4+\hspace{0.8em}(\theta_1-\theta_2-\theta_3\;)\,/\,2 \\ x_3 = e_1/4+(-\theta_1+\theta_2-\theta_3\;)\,/\,2 \\ x_4 = e_1/4+(-\theta_1-\theta_2+\theta_3\;)\,/\,2 \end{cases} \]
Remark. In practice, when finding roots of a general quartic \(f(x)=x^4+ax^3+bx^2+cx+d\), one doesn’t use the above formulas directly. Instead, first perform a shift to kill the coefficient of \(x^3\): \[\begin{align*} f(x-a/4)&=(x-a/4)^4+a(x-a/4)^3+b(x-a/4)^2+c(x-a/4)+d \\ &=x^4+px^2+qx+r. \end{align*}\] For this reduced quartic \(x^4+px^2+qx+r\), the formulas simplify since \(e_1=0\), \(e_2=p\), \(e_3=-q\), \(e_4=r\):
The cubic resolvent is \(t^3+2pt^2+(p^2-4r)t-q^2\).
The roots of this are \(t_1=\theta_1^2\), \(t_2=\theta_2^2\), \(t_3=\theta_3^2\) and we choose square roots so that \(\theta_1\theta_2\theta_3=-q\).
The roots of \(f(x-a/4)\) are \[\begin{align*} x_1&=\hspace{0.8em}(\theta_1+\theta_2+\theta_3)/2, \\ x_2&=\hspace{0.8em}(\theta_1-\theta_2-\theta_3)/2, \\ x_3&=(-\theta_1+\theta_2-\theta_3)/2, \\ x_4&=(-\theta_1-\theta_2+\theta_3)/2 \end{align*}\] and we recover the roots of \(f(x)\) by subtracting \(a/4\).
Example 8.2 Find all complex roots of the polynomial \(f(x)=x^4+6x^3+18x^2+30x+25\).
First eliminate the \(x^3\) term: \[\begin{align*} f\left(x-\frac{3}{2}\right) &=\left(x^4-6x^3+\frac{27}{2}x^2-\frac{27}{2}x+\frac{81}{16}\right) +6\left(x^3-\frac{9}{2}x^2+\frac{27}{4}x-\frac{27}{8}\right) \\ &\qquad\qquad\qquad +18\left(x^2-3x+\frac{9}{4}\right)+30\left(x-\frac{3}{2}\right)+25 \\ &=x^4+\frac{9}{2}x^2+3x+\frac{85}{16}. \end{align*}\] This is a reduced quartic \(x^4+px^2+qx+r\) with \(p=9/2\), \(q=3\), \(r=85/16\) and has cubic resolvent \[t^3+2pt^2+(p^2-4r)t-q^2=t^3+9t^2-t-9.\] One could find roots of this using the general method, but it’s easy to see here that \(t_1=1\) is a root and \[t^3+9t^2-t-9=(t-1)(t^2+10t+9)=(t-1)(t+1)(t+9)\] so the other roots are \(t_2=-1\), \(t_3=-9\). Set \(\theta_1=\sqrt{t_1}=1\), \(\theta_2=\sqrt{t_2}=i\) and \(\theta_3=\pm\sqrt{t_3}=\pm 3i\) so that \(\theta_1\theta_2\theta_3=-q=-3\), i.e. take \(\theta_3=+3i\). The roots of \(f(x-3/2)\) are thus \[\begin{align*} x_1&=\hspace{0.8em}(\theta_1+\theta_2+\theta_3)/2=\hspace{0.8em}(1+4i)/2, \\ x_2&=\hspace{0.8em}(\theta_1-\theta_2-\theta_3)/2=\hspace{0.8em}(1-4i)/2, \\ x_3&=(-\theta_1+\theta_2-\theta_3)/2=(-1-2i)/2, \\ x_4&=(-\theta_1-\theta_2+\theta_3)/2=(-1+2i)/2 \end{align*}\] and so \(f(x)\) has roots \(-1\pm 2i\), \(-2\pm i\). Notice that these are all in \(\mathbb{Q}(i)\) and in fact \(f(x)\) is reducible in \(\mathbb{Q}[x]\) \[f(x)=x^4+6x^3+18x^2+30x+25=(x^2+2x+5)(x^2+4x+5).\]
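The three steps of the method can also be run numerically. The sketch below is a floating-point illustration with hypothetical names `solve_reduced_cubic` and `solve_quartic`; it solves the cubic resolvent with the recipe from Section 8.2, and the explicit formulas for \(p,q,r\) come from expanding \(f(x-a/4)\). Accuracy can degrade when the resolvent has repeated or zero roots, so treat it as a sketch rather than a robust solver.

```python
import cmath

def solve_reduced_cubic(p, q):
    """Complex roots of t^3 + p t + q via the quadratic resolvent."""
    w = cmath.exp(2j * cmath.pi / 3)
    s = cmath.sqrt(q * q + 4 * p**3 / 27)
    t1 = max((-q + s) / 2, (-q - s) / 2, key=abs)
    if t1 == 0:
        return [0j, 0j, 0j]
    th1 = t1 ** (1 / 3)
    th2 = -p / (3 * th1)
    return [th1 + th2, w**2 * th1 + w * th2, w * th1 + w**2 * th2]

def solve_quartic(a, b, c, d):
    """Complex roots of x^4 + a x^3 + b x^2 + c x + d via the cubic resolvent."""
    # Reduced quartic x^4 + p x^2 + q x + r after the shift x -> x - a/4.
    p = b - 3 * a**2 / 8
    q = c - a * b / 2 + a**3 / 8
    r = d - a * c / 4 + a**2 * b / 16 - 3 * a**4 / 256
    # Step 1: roots of the cubic resolvent t^3 + 2p t^2 + (p^2 - 4r) t - q^2.
    s1, s2, s3 = 2 * p, p * p - 4 * r, -q * q
    P = s2 - s1**2 / 3
    Q = 2 * s1**3 / 27 - s1 * s2 / 3 + s3
    ts = [u - s1 / 3 for u in solve_reduced_cubic(P, Q)]
    # Step 2: square roots, with signs chosen so that th1*th2*th3 = -q.
    th1, th2, th3 = (cmath.sqrt(t) for t in ts)
    if abs(th1 * th2 * th3 - q) < abs(th1 * th2 * th3 + q):
        th3 = -th3
    # Step 3: solve the linear system, then undo the shift.
    ys = [(th1 + th2 + th3) / 2, (th1 - th2 - th3) / 2,
          (-th1 + th2 - th3) / 2, (-th1 - th2 + th3) / 2]
    return [y - a / 4 for y in ys]

for z in solve_quartic(6, 18, 30, 25):
    print(complex(round(z.real, 6), round(z.imag, 6)))
```

Applied to the polynomial of Example 8.2, the routine should reproduce the roots \(-1\pm 2i\) and \(-2\pm i\) up to rounding, in some order.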
Galois groups of quartic polynomials
Let \(K\) be a field with characteristic not \(2\) or \(3\) and suppose \[f(x)=x^4+ax^3+bx^2+cx+d\in K[x].\] The Galois group is \(G_f=\Gal(L/K)\) where \(L\) is a splitting field for \(f(x)\) over \(K\) and this is a subgroup of \(S_4\). Now \(f(x)\) might be reducible, in which case we find \(G_f\) using our previous work in lower degrees.
So assume we have checked \(f(x)\) is irreducible in \(K[x]\). We’ll find that there are five possible isomorphism classes of Galois groups: \(G_f\) is isomorphic to \(S_4\), \(A_4\), \(V_4\), \(\mathbb{Z}/4\) or the dihedral group \(D_4\) of order \(8\).
Let \(R(t)\in K[t]\) be the cubic resolvent of \(f(x)\), which as above, has roots \(t_1=\theta_1^2\), \(t_2=\theta_2^2\) and \(t_3=\theta_3^2\). Let \(M\) be a splitting field of \(R(t)\) over \(K\), so that \[K\;\subset\; K(t_1,t_2,t_3)\;\subset\;M\;\subset\; L=M(\theta_1,\theta_2,\theta_3).\] Now \(R(t)\) may or may not be irreducible in \(K[t]\). If it is irreducible, then the Galois Group of \(R(t)\), i.e. \(G_R=\Gal(M/K)\), is isomorphic to either \(S_3\) or \(A_3\) by our work on cubic polynomials. In fact, by Theorem 8.2, these cases are distinguished by the discriminant \(\Delta_R\) of \(R(t)\).
Theorem 8.4 Suppose that irreducible \(f(x)\in K[x]\) has irreducible cubic resolvent \(R(t)\in K[t]\). Let \(L\) be the splitting field of \(f(x)\) over \(K\) (so that \(G_f=\Gal(L/K)\)) and let \(M\) be the splitting field of \(R(t)\) over \(K\) (so that \(G_R=\Gal(M/K)\)).
\(\;\;(i)\) When \(\Delta_R\in {K^\times}^2\), (so \(G_R\cong A_3\) and \([M:K]=3\)), we have \(G_f\cong A_4\).
\(\;(ii)\) When \(\Delta_R\in K^\times\setminus{K^\times}^2\), (so \(G_R\cong S_3\) and \([M:K]=6\)), we have \(G_f\cong S_4\).
Proof. It will be sufficient to prove that \([L:M]=4\) since then \([L:K]=12\) or \(24\) by the Tower Law.
We claim \(M\) does not contain any of \(\theta_1,\theta_2,\theta_3\):
- Suppose on the contrary that it does and, say, \(\theta_1\in M\). Since \(\Gal(M/K)\) is \(A_3\) or \(S_3\), there must be an order \(3\) element \(\sigma\in\Gal(M/K)\), which permutes the roots \(t_1,t_2,t_3\) of \(R(t)\) cyclically. Hence \(\sigma(\theta_1)\) and \(\sigma^2(\theta_1)\) are square roots of the other two roots of \(R(t)\), i.e. they equal \(\pm\theta_2\) and \(\pm\theta_3\) in some order, and so all of \(\theta_1,\theta_2,\theta_3\) lie in \(M\). But then \(M=L\) and \([L:K]=3\) or \(6\). This is a contradiction since \(L\) contains the roots of the irreducible quartic \(f(x)\) so \(4\mid [L:K]\).
Now \(M(\theta_1)/M\) is degree \(2\) and we next prove \(\theta_2\not\in M(\theta_1)\). We must have \(\Gal(M(\theta_1)/M)=\{\id,\tau\}\) for some automorphism \(\tau\) where \(\tau(\theta_1)=-\theta_1\). Furthermore, \(\theta_2^2\in M\) so \(\tau(\theta_2)=\pm\theta_2\).
If \(\tau(\theta_2)=+\theta_2\), then \(\theta_2\in M\) contradicting the above claim.
If \(\tau(\theta_2)=-\theta_2\), then \(\tau(\theta_1\theta_2)=(-\theta_1)(-\theta_2)=\theta_1\theta_2\) and so \(\theta_1\theta_2\in M\). However, \(\theta_1\theta_2\theta_3\in K\subset M\) via the expression \(\theta_1\theta_2\theta_3=(e_1^3-4e_1e_2+8e_3)/8\) and \(\theta_1\theta_2\neq 0\) since \(R(t)\) is irreducible. This implies \(\theta_3\in M\), again contradicting the above claim.
Hence \([M(\theta_1,\theta_2):M]\geq 4\) and since \(\theta_1\theta_2\theta_3\in M\), we conclude \(L=M(\theta_1,\theta_2)\) and \([L:M]=4\).
Example 8.3 The polynomial \(f(x)=x^4+3x^2-7x+4\) is irreducible over \(\mathbb{Q}\). (Check it has no roots in \(\mathbb{Q}\) and no quadratic factors.) The cubic resolvent \(R(t)=t^3+6t^2-7t-49\) is also irreducible over \(\mathbb{Q}\) and has discriminant \(\Delta_R=133^2\in{\mathbb{Q}^\times}^2\). (To speed up the calculation, \(\Delta_R\) is the same as the discriminant of \(R(t-2)=t^3-19t-19\).) The Theorem implies that \(G_f\cong A_4\).
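The discriminant computation in Example 8.3 is easy to check by machine. The following is a minimal sketch using the standard formula for the discriminant of a monic cubic \(t^3+at^2+bt+c\); the function name `cubic_discriminant` is our own and is not notation from these notes.

```python
from math import isqrt

def cubic_discriminant(a, b, c):
    """Discriminant of the monic cubic t^3 + a*t^2 + b*t + c (standard formula)."""
    return a*a*b*b - 4*b**3 - 4*a**3*c - 27*c*c + 18*a*b*c

# R(t) = t^3 + 6t^2 - 7t - 49 and its shift R(t-2) = t^3 - 19t - 19
disc_R = cubic_discriminant(6, -7, -49)
disc_shifted = cubic_discriminant(0, -19, -19)

# A rational root of the monic integer cubic R(t) would be an integer dividing 49.
def R(t):
    return t**3 + 6*t**2 - 7*t - 49

has_rational_root = any(R(t) == 0 for t in (1, -1, 7, -7, 49, -49))

root = isqrt(disc_R)
print(disc_R, disc_shifted, root * root == disc_R, has_rational_root)
```

Both discriminants agree, the common value is a perfect square, and \(R(t)\) has no rational root (so, being a cubic, it is irreducible over \(\mathbb{Q}\)), which is exactly what the Theorem needs.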
We are left with the trickiest case where \(f(x)\in K[x]\) is irreducible but the cubic resolvent \(R(t)\in K[t]\) is reducible. First of all, it’s possible that all three roots \(t_1=\theta_1^2\), \(t_2=\theta_2^2\) and \(t_3=\theta_3^2\) of \(R(t)\) are in \(K\). Then \(M=K\) and \(L=K(\theta_1,\theta_2,\theta_3)\). Using \(\theta_1\theta_2\theta_3\in K\), we see that \(L/K\) is actually obtained by adjoining just two square roots to \(K\). Furthermore, since \(f(x)\) is irreducible of degree \(4\), we must have \([L:K]\geq 4\) and the only option is a biquadratic extension \[G_f=\Gal(L/K)=V_4\cong\mathbb{Z}/2\times\mathbb{Z}/2.\]
The final possibility is that exactly one of the roots \(t_1,t_2,t_3\) is in \(K\). We’ll rephrase the set-up in the following way.
\(M\) is the splitting field of an irreducible quadratic polynomial over \(K\). Thus \(M=K(\sqrt{d})\) for some \(d\in K^\times\setminus {K^\times}^2\) and \(\Gal(M/K)=\{\id,\varphi\}\cong\mathbb{Z}/2\) where \(\varphi(\sqrt{d})=-\sqrt{d}\).
There are conjugate elements \(\alpha\) and \(\overline{\alpha}=\varphi(\alpha)\) in \(M^\times\setminus{M^\times}^2\) where \(L\) is obtained by adjoining the square roots of these two elements to \(M\): \[K \;\subset\; M=K(\sqrt{d})=K(\alpha,\overline{\alpha}) \;\subset\; L=M\left(\sqrt{\alpha},\sqrt{\,\overline{\alpha}\,}\right)\]
Note that in this general situation, \(L/K\) is a normal extension. Indeed, if \(\alpha,\overline{\alpha}\) are the roots of a quadratic polynomial \(x^2+ax+b\in K[x]\), then \(\pm\sqrt{\alpha},\,\pm\sqrt{\,\overline{\alpha}\,}\) are the roots of \(x^4+ax^2+b\in K[x]\). Therefore \(L\) is a splitting field of \(x^4+ax^2+b\) over \(K\). For the tower of fields above, we have Galois groups \[G=\Gal(L/K) \;\supset\; H=\Gal(L/M) \;\supset\; \{\id\}\] and \(G/H\cong\Gal(M/K)=\{\id,\varphi\}\cong\mathbb{Z}/2\). The structure of \(\Gal(L/K)\) depends on where \(\alpha\overline{\alpha}\) lives.
Theorem 8.5 With the above notation, the Galois group \(G=\Gal(L/K)\) is determined by the following:
\(\;\;\;(i)\;\) If \(\alpha\overline{\alpha}\in {K^\times}^2\), then \([L:K]=4\) and \(G\cong\mathbb{Z}/2\times\mathbb{Z}/2\).
\(\;\;(ii)\;\) If \(\alpha\overline{\alpha}\in {M^\times}^2\setminus {K^\times}^2\), then \([L:K]=4\) and \(G\cong\mathbb{Z}/4\).
\(\;(iii)\;\) If \(\alpha\overline{\alpha}\not\in {M^\times}^2\), then \([L:K]=8\) and \(G\cong D_4\).
Proof. \((i)\) We have \(\alpha\overline{\alpha}=b^2\) where \(b\in K^\times\subset M^\times\). Choose square roots so that \(\sqrt{\,\overline{\alpha}\,}=b/\sqrt{\alpha}\) and then \(L=M(\sqrt{\alpha})\) is degree \(2\) over \(M\). Thus \(|G|=[L:K]=4\) and we can define the four elements of \(G\) explicitly by their distinct actions on \(\sqrt{\alpha}\). Note they all fix \(b\) since \(b\in K\).
\(H=\Gal(L/M)=\{\id,\tau\}\subset G\) where \(\tau(\sqrt{\alpha})=-\sqrt{\alpha}\). Then \(\tau^2=\id\).
Define \(\sigma\in G\) by \(\sigma(\sqrt{\alpha})=-\sqrt{\,\overline{\alpha}\,}=\dfrac{-b}{\sqrt{\alpha}}\). Then \(\sigma^2=\id\) as \[\sigma^2(\sqrt{\alpha})=\dfrac{-b}{\sigma(\sqrt{\alpha})}=\dfrac{-b}{-b/\sqrt{\alpha}}=\sqrt{\alpha}.\]
Also, \(\sigma\tau(\sqrt{\alpha})=b/\sqrt{\alpha}=\tau\sigma(\sqrt{\alpha})\) so \(\sigma\tau=\tau\sigma\).
Hence in this case, we have \(G=\langle\,\sigma,\tau\;\mid\;\sigma^2=\tau^2=\id,\; \sigma\tau=\tau\sigma\,\rangle\cong\mathbb{Z}/2\times\mathbb{Z}/2\).
\((ii)\) Now \(\alpha\overline{\alpha}=b^2\) where \(b\in M^\times\setminus K^\times\) and so \(b=c\sqrt{d}\) for some \(c\in K^\times\). Choose square roots so that \(\sqrt{\,\overline{\alpha}\,}=c\sqrt{d}/\sqrt{\alpha}\) and then \(L=M(\sqrt{\alpha})\) is degree \(2\) over \(M\). Again, \(|G|=[L:K]=4\) but the difference from the previous case is that, since some elements of \(G\) send \(\sqrt{d}\mapsto -\sqrt{d}\), they will also send \(b\mapsto -b\).
Define \(\sigma\in G\) by \(\sigma(\sqrt{d})=-\sqrt{d}\) and \(\sigma(\sqrt{\alpha})=-\sqrt{\,\overline{\alpha}\,}=\dfrac{-c\sqrt{d}}{\sqrt{\alpha}}\). Then \(\sigma^2(\sqrt{d})=\sqrt{d}\) and \[\sigma^2(\sqrt{\alpha})=\dfrac{\sigma(-c\sqrt{d})}{\sigma(\sqrt{\alpha})}=\dfrac{c\sqrt{d}}{-c\sqrt{d}/\sqrt{\alpha}}=-\sqrt{\alpha}.\]
Also, \(\sigma^3(\sqrt{\alpha})=-\sigma(\sqrt{\alpha})=\dfrac{c\sqrt{d}}{\sqrt{\alpha}}=\sqrt{\,\overline{\alpha}\,}\) and \(\sigma^4(\sqrt{\alpha})=\sqrt{\alpha}\) so \(\sigma^4=\id\).
Hence in this case, we have \(G=\{\id,\sigma,\sigma^2,\sigma^3\}\cong\mathbb{Z}/4\). Notice that \(H=\Gal(L/M)=\{\id,\sigma^2\}\subset G\).
\((iii)\) Finally, we have \(\alpha\overline{\alpha}\not\in {M^\times}^2\). Now \(\sqrt{\alpha\overline{\alpha}}\not\in M^\times\) so \(\sqrt{\,\overline{\alpha}\,}\not\in M(\sqrt{\alpha})\) and \(L=M(\sqrt{\alpha},\sqrt{\,\overline\alpha\,})\) is a degree \(4\) extension of \(M\). We can describe \(G\) as follows:
Define \(\sigma_1,\sigma_2\in H=\Gal(L/M)\subset G\) by \(\;\;\;\begin{cases} \sigma_1(\sqrt{\alpha})=-\sqrt{\alpha},\hspace{2em} \sigma_1(\sqrt{\,\overline{\alpha}\,})=\hspace{0.8em}\sqrt{\,\overline{\alpha}\,} \\ \sigma_2(\sqrt{\alpha})=\hspace{0.8em}\sqrt{\alpha},\hspace{2em} \sigma_2(\sqrt{\,\overline{\alpha}\,})=-\sqrt{\,\overline{\alpha}\,} \end{cases}\)
Then \(\sigma_1^2=\sigma_2^2=\id\), \(\sigma_1\sigma_2=\sigma_2\sigma_1\) and \(H=\langle\,\sigma_1,\sigma_2\,\rangle\cong\mathbb{Z}/2\times\mathbb{Z}/2\).
Note that \(\sigma_1(\sqrt{d})=\sqrt{d}\) and \(\sigma_2(\sqrt{d})=\sqrt{d}\) since \(\sqrt{d}\in M\).
Now define \(\tau\in G\) by \(\tau(\sqrt{\alpha})=\sqrt{\,\overline{\alpha}\,}\), \(\tau(\sqrt{\,\overline{\alpha}\,})=\sqrt{\alpha}\) and \(\tau(\sqrt{d})=-\sqrt{d}\). (This is a lifting of the generator of \(\Gal(M/K)\) to \(G\).) Clearly, \(\tau^2=\id\).
Now \(\tau\) doesn’t commute with \(\sigma_1,\sigma_2\) but it’s easy to check that \(\sigma_1\tau=\tau\sigma_2\). (Just compare how the two sides act on \(\sqrt{\alpha}\), \(\sqrt{\,\overline{\alpha}\,}\) and \(\sqrt{d}\)).
We obtain the group presentation \[\begin{align*} G&=\{\id,\sigma_1,\sigma_2,\sigma_1\sigma_2,\tau,\tau\sigma_1,\tau\sigma_2,\tau\sigma_1\sigma_2\} \\ &=\langle\,\sigma_1,\sigma_2,\tau\;\mid\; \sigma_1^2=\sigma_2^2=\tau^2=\id,\; \sigma_1\sigma_2=\sigma_2\sigma_1,\; \tau\sigma_1\tau=\sigma_2\,\rangle. \end{align*}\] Now perhaps we don’t recognise this group, and there are two non-abelian groups of order 8, namely \(D_4\) and the Quaternion group \(Q_8\). However, if we set \(\sigma=\tau\sigma_2\), then \[\sigma^2=(\tau\sigma_2)(\tau\sigma_2)=\tau(\sigma_2\tau)\sigma_2=\tau(\tau\sigma_1)\sigma_2=\sigma_1\sigma_2\] which has order \(2\) and so \(\sigma\) has order \(4\). Furthermore, \(\sigma^3=(\tau\sigma_2)(\sigma_1\sigma_2)=\tau\sigma_1=\sigma_2\tau=\tau^{-1}\sigma\tau\). Thus we’ve found that \(G\) is the dihedral group with \(8\) elements \[G=\langle\,\sigma,\tau\;\mid\; \sigma^4=\tau^2=\id,\;\; \tau^{-1}\sigma\tau=\sigma^3\,\rangle\cong D_4. \]
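The relations in case \((iii)\) can be sanity-checked by letting \(\sigma_1,\sigma_2,\tau\) act as permutations of the four roots. This is only a sketch: we label the roots \(\sqrt{\alpha},-\sqrt{\alpha},\sqrt{\,\overline{\alpha}\,},-\sqrt{\,\overline{\alpha}\,}\) as \(0,1,2,3\) (our own convention), and the helper names are illustrative.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """Closure of the generators under composition (a finite group)."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def order(p):
    identity = tuple(range(len(p)))
    k, q = 1, p
    while q != identity:
        q, k = compose(p, q), k + 1
    return k

# Roots labelled 0: sqrt(a), 1: -sqrt(a), 2: sqrt(abar), 3: -sqrt(abar)
sigma1 = (1, 0, 2, 3)   # sqrt(a) -> -sqrt(a), fixes sqrt(abar)
sigma2 = (0, 1, 3, 2)   # fixes sqrt(a), sqrt(abar) -> -sqrt(abar)
tau    = (2, 3, 0, 1)   # swaps sqrt(a) and sqrt(abar)

G = generate([sigma1, sigma2, tau])
sigma = compose(tau, sigma2)
involutions = sum(1 for g in G if order(g) == 2)

print(len(G), order(sigma), involutions)
```

The group has \(8\) elements, \(\sigma=\tau\sigma_2\) has order \(4\), and there are five elements of order \(2\). Since \(Q_8\) has only one element of order \(2\), this is consistent with \(G\cong D_4\).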
Example 8.4 The final case with dihedral group \(D_4\) was the hardest to describe, but we’ve already seen an explicit example in Exercise 4.6(g) which asked us to find the Galois group \(G=\Gal(L/K)\) of the splitting field \(L\) of the quartic polynomial \(f(x)=x^4+x^2-1\) over \(\mathbb{Q}\).
The polynomial \(f(x)\) is irreducible over \(\mathbb{Q}\) and it has roots \(\pm\sqrt{\alpha}\), \(\pm\sqrt{\,\overline{\alpha}\,}\) where \(\alpha, \overline{\alpha}=\dfrac{-1\pm\sqrt{5}}{2}\).
In particular, \(\alpha,\overline{\alpha}\in M=\mathbb{Q}(\sqrt{5})\) so we have \(d=5\) in the above set-up.
Now \(\alpha\overline{\alpha}=-1\) is not in \({M^\times}^2\) since \(i\not\in\mathbb{Q}(\sqrt{5})\) so the above Theorem tells us \(G\cong D_4\).
Look back at the solution to Exercise 4.6(g) and compare the action of \(G\) on \(\pm\sqrt{\alpha}\) and \(\pm\sqrt{\,\overline{\alpha}\,}\) described there with the situation above. Notice how \(L\) contains both quadratic extensions \(\mathbb{Q}(\sqrt{5})\) and \(\mathbb{Q}(i)\) of \(\mathbb{Q}\) and we can use either in describing \(G\).
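For \(K=\mathbb{Q}\), Theorem 8.5 turns into a quick test on the coefficients of \(x^4+ax^2+b\). The sketch below assumes integer coefficients, that the quartic is irreducible over \(\mathbb{Q}\), and that \(d=a^2-4b\) is not a square (none of which the code verifies beyond the last). Here \(b=\alpha\overline{\alpha}\) is the constant term of the quadratic \(x^2+ax+b\), not the square root called \(b\) in the proof. It uses the observation that an element of \(K\) is a square in \(M=K(\sqrt{d})\) exactly when it, or its product with \(d\), is a square in \(K\). The function name is our own.

```python
from math import isqrt

def is_square(n):
    """True if the integer n is a perfect square (hence a square in Q)."""
    return n >= 0 and isqrt(n) ** 2 == n

def biquadratic_galois_group(a, b):
    """Galois group of x^4 + a*x^2 + b over Q, following Theorem 8.5.

    ASSUMPTIONS (not checked): a, b are integers and the quartic is
    irreducible over Q.
    """
    d = a * a - 4 * b                 # M = Q(sqrt(d))
    if is_square(d):
        raise ValueError("alpha lies in Q; the set-up of Theorem 8.5 does not apply")
    if is_square(b):                  # case (i): b is a square in K
        return "Z/2 x Z/2"
    if is_square(b * d):              # case (ii): b = c^2 * d is a square in M only
        return "Z/4"
    return "D_4"                      # case (iii)

print(biquadratic_galois_group(1, -1))    # x^4 + x^2 - 1 from Example 8.4
print(biquadratic_galois_group(-4, 2))    # x^4 - 4x^2 + 2, roots +-sqrt(2 +- sqrt(2))
print(biquadratic_galois_group(-10, 1))   # x^4 - 10x^2 + 1, roots +-sqrt(2) +- sqrt(3)
```

The three sample polynomials land in cases \((iii)\), \((ii)\) and \((i)\) respectively.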
8.4 A Criterion for solvability by radicals
Galois’ principal motivation for constructing his Theory was to determine whether or not a given polynomial \(f(x)\) can be solved using radical operations. We’ll now see how a group-theoretic property of the Galois group of \(f(x)\) answers this problem. For simplicity, we’ll assume that all fields in this section have characteristic zero. (One can formulate some results more generally, but there are complications due to the lack of non-trivial \(p\)-th roots of unity in characteristic \(p>0\).)
Definition 8.1 A field extension \(L/K\) is called a radical extension if there is a tower of field extensions \[K=K_0 \;\subset\; K_1 \;\subset\; \cdots \;\subset\; K_{m-1} \;\subset\; K_m=L \] where for each \(1\leq i\leq m\), we have \(K_i=K_{i-1}(\alpha_i)\) with \(\alpha_i^{n_i}\in K_{i-1}\) and \(n_i\in\mathbb{N}\).
In particular, a radical extension of \(K\) is obtained by using finitely many arithmetic operations and radical operations \(\sqrt[n]{\,\cdot\;}\) on the elements of \(K\).
Example 8.5 \((a)\;\) Consider the extension \(L=\mathbb{Q}\left(\sqrt{2+\sqrt{2}}\right)\) over \(K=\mathbb{Q}\). This is radical since \[K=K_0=\mathbb{Q}\;\subset\; K_1=K_0(\alpha_1) \;\subset\; L=K_2=K_1(\alpha_2)\] where \(\alpha_1^2=2\in K_0\) and \(\alpha_2^2=2+\sqrt{2}\in K_1\).
\((b)\;\) Consider a splitting field \(L\) of \(f(x)=x^3+x^2-4x+1\) over \(\mathbb{Q}\). Via the procedure in Section 8.2, we can solve this using radicals. As usual, let \(\omega=e^{2\pi i/3}=(-1+\sqrt{-3})/2\).
The quadratic resolvent is \(t^2+(65/27)t+(13/9)^3\) and has roots \[t_1,t_2=\dfrac{13}{2\cdot 3^3}(-5\pm 3\sqrt{-3}),\qquad\text{i.e.}\qquad t_1=\dfrac{1}{3^3}(-13+39\omega),\quad t_2=\dfrac{1}{3^3}(-13+39\omega^2).\]
We choose cube roots \(\theta_1=\sqrt[3]{t_1}\) and \(\theta_2=\sqrt[3]{t_2}\) so that \(\theta_1\theta_2=13/9\).
The roots of \(f(x)\) are \(-1/3+\theta_1+\theta_2\), \(-1/3+\omega^2\theta_1+\omega\theta_2\), \(-1/3+\omega\theta_1+\omega^2\theta_2\).
Interestingly, these three numbers are actually real-valued and so \(L\subset\mathbb{R}\). One way to see this is to note \(f(-3)<0\), \(f(-1)>0\), \(f(1)<0\), \(f(2)>0\). Then, by the Intermediate Value Theorem, \(f(x)\) has at least three real roots. Even though all three roots of \(f(x)\) can be expressed in terms of radicals and lie in \(L\), the extension \(L/\mathbb{Q}\) is not a radical extension! Indeed, suppose it was.
In Example 8.1, we found that \(f(x)\) is irreducible in \(\mathbb{Q}[x]\) and has discriminant \(\Delta=13^2\in{\mathbb{Q}^\times}^2\). This implies that \(G_f=\Gal(L/\mathbb{Q})\cong A_3=\mathbb{Z}/3\) and \([L:\mathbb{Q}]=3\).
By considering the degrees when adjoining roots in forming a radical extension, we must have \(L=\mathbb{Q}(\alpha)\) for some \(\alpha\) with \(\alpha^m\in\mathbb{Q}\) and \(m\geq 3\). Since \([L:\mathbb{Q}]=3\), the minimal polynomial of \(\alpha\) has degree \(3\) and must divide \(x^m-\alpha^m\). Furthermore, \(L/\mathbb{Q}\) is Galois so this minimal polynomial splits completely in \(L\). Its three roots are distinct elements of the set \(\{\alpha, \zeta_m\alpha,...,\zeta_m^{m-1}\alpha\}\), where \(\zeta_m\) is a primitive \(m\)-th root of unity. But at most two elements of this set are real (namely \(\pm\alpha\)), so at least one of the roots is not real. This is impossible since it contradicts \(L\subset\mathbb{R}\).
Thus \(L/\mathbb{Q}\) is not a radical extension. On the other hand, the roots of \(f(x)\) (and hence \(L\)) are contained in a radical extension of \(\mathbb{Q}\). In fact, using the above calculation, they lie in the top of the tower \[\mathbb{Q} \;\subset\; \mathbb{Q}(\omega) \;\subset\; \mathbb{Q}(\omega)(\sqrt[3]{-13+39\omega}) \;\subset\; \mathbb{Q}(\omega)(\sqrt[3]{-13+39\omega})(\sqrt[3]{-13+39\omega^2}).\]
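The radical expressions for the roots of \(f(x)=x^3+x^2-4x+1\) can be checked numerically. The sketch below works in floating-point complex arithmetic (so it is evidence, not proof). It takes \(t_1=(-13+39\omega)/27\) and its complex conjugate \(t_2=(-13+39\omega^2)/27\), uses the principal cube root for \(\theta_1\), and picks \(\theta_2\) from the constraint \(\theta_1\theta_2=13/9\). Variable names are our own.

```python
import cmath

omega = cmath.exp(2j * cmath.pi / 3)          # primitive cube root of unity

# Roots of the quadratic resolvent t^2 + (65/27) t + (13/9)^3
t1 = (-13 + 39 * omega) / 27
t2 = (-13 + 39 * omega**2) / 27
resid = [abs(t * t + (65 / 27) * t + (13 / 9) ** 3) for t in (t1, t2)]

theta1 = t1 ** (1 / 3)                        # principal cube root of t1
theta2 = (13 / 9) / theta1                    # forces theta1 * theta2 = 13/9

def f(x):
    return x**3 + x**2 - 4 * x + 1

roots = [
    -1 / 3 + theta1 + theta2,
    -1 / 3 + omega**2 * theta1 + omega * theta2,
    -1 / 3 + omega * theta1 + omega**2 * theta2,
]

for r in roots:
    print(r, abs(f(r)))
```

Each computed value satisfies \(f(r)\approx 0\) with imaginary part zero up to rounding error, illustrating that the roots are real even though the intermediate quantities \(\theta_1,\theta_2\) are not.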
Remark. This is an example of the Casus Irreducibilis: there are polynomials with only real roots that can be expressed using radicals, but only by introducing complex numbers. Understanding this strange phenomenon was a significant factor in the early development and usage of complex numbers (which Cardano described as “mental torture”!).
The above examples motivate the precise definition of solvability in radicals.
Definition 8.2 Suppose \(f(x)\) is an irreducible polynomial in \(K[x]\). Then we say \(f(x)\) is solvable in radicals over \(K\) if there is a radical extension \(L\) of \(K\) containing at least one root of \(f(x)\).
Now it isn’t immediately obvious from this definition that if \(f(x)\in K[x]\) is solvable in radicals over \(K\), then all of its roots belong to a radical extension of \(K\). However, this follows from the next result, which also allows us to work with radical extensions which are Galois.
Lemma 8.1 Let \(L/K\) be a radical extension with normal closure \(N/K\). Then \(N/K\) is also a radical extension, and in particular, any radical extension is contained in a Galois radical extension.
Proof. Given a radical extension \(L/K\), we have subfields \[K=K_0 \;\subset\; K_1 \;\subset\; \cdots \;\subset\; K_{m-1} \;\subset\; K_m=L \] where \(K_i=K_{i-1}(\alpha_i)\) with \(\alpha_i^{n_i}\in K_{i-1}\). Let \(f_i(x)\in K[x]\) be the minimal polynomial of \(\alpha_i\) over \(K\). Then the normal closure \(N\) is the splitting field of the product \(f_1(x)\cdots f_m(x)\) over \(K\), and \(N\) contains all roots of each \(f_i(x)\).
We prove that \(N/K\) is radical by induction on \(m\). If \(m=0\), then \(K=L=N\) which is trivially radical. Suppose \(m\geq 1\) and let \(M\) be the splitting field of \(f_1(x)\cdots f_{m-1}(x)\) over \(K\). Then \(M/K\) is Galois and \(M=K(\beta_1,...,\beta_r)\) where the \(\beta_j\) are all the roots of \(f_1(x),...,f_{m-1}(x)\). These roots include \(\alpha_1,...,\alpha_{m-1}\) so we have \(K_{m-1}\subset M\). By the inductive hypothesis, \(M/K\) is a radical extension. It should be clear from the definition that \[\text{$N/M$ and $M/K$ both radical}\qquad\implies\qquad \text{$N/K$ is radical}\] by writing one tower of extensions after the other. So it remains to prove that \(N/M\) is radical.
Now \(N\) is obtained by adjoining all the roots of \(f_m(x)\) to \(M\), including \(\alpha_m\). Let \(\gamma\) be any of these roots. Since \(\alpha_m\) and \(\gamma\) share the same minimal polynomial over \(K\), there is \(\sigma\in\Gal(N/K)\) such that \(\sigma(\alpha_m)=\gamma\) and hence \[\gamma^{n_m}=\sigma(\alpha_m)^{n_m}=\sigma(\alpha_m^{n_m}).\] However, \(\alpha_m^{n_m}\in K_{m-1}\subset M\) and the intermediate extension \(M/K\) is Galois so the Fundamental Theorem implies \(\sigma(M)=M\). Hence \[\gamma^{n_m}=\sigma(\alpha_m^{n_m})\in\sigma(M)=M.\] Adjoining the roots \(\gamma\) of \(f_m(x)\) to \(M\) one at a time therefore gives a radical tower, and it follows that \(N\) is a radical extension of \(M\) as required.
In particular, if \(f(x)\) has one root in \(L\), then it has all of its roots in the Galois extension \(N/K\) which is also radical.
We now turn to the group-theoretic side of the set-up with a corresponding definition of solvable group.
Definition 8.3 A finite group \(G\) is solvable if there exists a decreasing sequence of subgroups \[G=G_0 \;\supset\; G_1 \;\supset\; \cdots \;\supset\; G_{m-1} \;\supset\; G_m=\{\id\}\] where for each \(1\leq i\leq m\), the group \(G_{i}\) is a normal subgroup of \(G_{i-1}\) and \(G_{i-1}/G_{i}\) is cyclic.
Remark. In the literature, one sometimes sees the condition “\(G_{i-1}/G_i\) cyclic” replaced by “\(G_{i-1}/G_i\) abelian” or “\(G_{i-1}/G_i\) cyclic of prime order”. These in fact give equivalent definitions. Using the classification of finite abelian groups, one can essentially refine a sequence with \(G_{i-1}/G_i\) abelian to one with \(G_{i-1}/G_i\) cyclic and then refine again to give one with \(G_{i-1}/G_i\) cyclic of prime order.
Here are some fundamental properties of solvable groups, showing they work nicely with subgroups and quotients.
Lemma 8.2 \(\;\;\;(i)\;\) Every subgroup \(H\) of a finite solvable group \(G\) is solvable.
\(\;\;(ii)\;\) Let \(G\) be a finite group with normal subgroup \(H\). Then \(G\) is solvable if and only if both \(H\) and \(G/H\) are solvable.
Proof. \((i)\;\) Given a solvable group \(G\) as in the definition and \(H\subset G\), define \(H_i=G_i\cap H\). Then \[H=H_0 \;\supset\; H_1 \;\supset\; \cdots \;\supset\; H_{m-1} \;\supset\; H_m=\{\id\}\] and we claim \(H_i\) is normal in \(H_{i-1}\) and \(H_{i-1}/H_i\) is cyclic. Consider the group homomorphism \[\begin{align*} \pi: H_{i-1} &\longrightarrow G_{i-1}/G_i \\ h &\longmapsto hG_i. \end{align*}\] Observe that \(h\in\ker(\pi)\) exactly when \(hG_i=G_i\), i.e. \(h\in G_i\) and thus when \[h\in H_{i-1}\cap G_i=(G_{i-1}\cap H)\cap G_i=H\cap G_i=H_i.\] That means \(\ker(\pi)=H_i\) is a kernel and hence is a normal subgroup of \(H_{i-1}\). Furthermore, by the First Isomorphism Theorem for groups, \[H_{i-1}/H_i=H_{i-1}/\ker(\pi)\cong \im(\pi)\subset G_{i-1}/G_i.\] Thus \(H_{i-1}/H_i\) is a subgroup of a cyclic group and is cyclic.
\((ii)\;\) Some of this might require a bit more group theory than we’re comfortable with so we’ll leave some gaps to fill. Given a normal subgroup \(H\) in \(G\), we use the canonical projection \(\pi:G\to G/H\) onto the quotient \(Q=G/H\).
For the if part, suppose the quotient \(Q=G/H\) is solvable and we have a sequence \[Q=Q_0 \;\supset\; Q_1 \;\supset\; \cdots \;\supset\; Q_{n-1} \;\supset\; Q_n=\{\id\}\] with \(Q_i\) normal in \(Q_{i-1}\) and \(Q_{i-1}/Q_i\) cyclic. Consider the inverse image \(G_i=\pi^{-1}(Q_i)\) \[G_i=\pi^{-1}(Q_i)=\{g\in G \;\mid\; \pi(g)\in Q_i\}.\] This gives a sequence of subgroups of \(G\) where, in particular, \(G_0=G\) and \(G_n=\pi^{-1}(Q_n)=\pi^{-1}(\{\id\})=H\). \[G=G_0 \;\supset\; G_1 \;\supset\; \cdots \;\supset\; G_{n-1} \;\supset\; G_n=H\] One can now show that \(G_{i}\) is a normal subgroup of \(G_{i-1}\) and that there is an isomorphism \[\begin{align*} G_{i-1}/G_i\;\;\tilde{\longrightarrow}\;\; Q_{i-1}/Q_i \\ gG_i\;\;\longmapsto \;\; \pi(g)Q_i. \end{align*}\] If \(H\) is also solvable, then there is a decreasing sequence \[H=H_0 \;\supset\; H_1 \;\supset\; \cdots \;\supset\; H_{m-1} \;\supset\; H_m=\{\id\}\] with \(H_i\) normal in \(H_{i-1}\) and \(H_{i-1}/H_i\) cyclic. Gluing these two sequences together shows that \(G\) is solvable.
For the only if part, suppose \(G\) is solvable. Then \(H\) is solvable by part \((i)\), so we need only show that \(Q=G/H\) is solvable. Given the sequence for \(G\), \[G=G_0 \;\supset\; G_1 \;\supset\; \cdots \;\supset\; G_{m-1} \;\supset\; G_m=\{\id\}\] define \(Q_i=\pi(G_i)\). One can show that \(Q_i\cong G_i/(G_i\cap H)\) and that \(Q_i\) is a normal subgroup of \(Q_{i-1}\). Furthermore, \(Q_{i-1}/Q_i\) is a quotient of \(G_{i-1}/G_i\) and quotients of cyclic groups are cyclic.
Example 8.6 \((a)\;\) Any finite abelian group \(G\) is solvable. This can be shown easily using the fact that a finite abelian group is the direct product of cyclic groups. Alternatively, it can be proved by induction using the above Lemma as follows. If \(|G|>1\), then any non-identity element generates a cyclic subgroup \(H\) which is necessarily normal as \(G\) is abelian. Then \(H\) is clearly solvable and \(G/H\) is solvable by the induction hypothesis.
\((b)\;\) The dihedral group \(D_n\) (the symmetries of the regular \(n\)-sided polygon) is solvable for all \(n\). Indeed, \(D_n\) contains a normal subgroup \(H\cong\mathbb{Z}/n\) (the subgroup of rotations) \[D_n \;\supset\; H\cong\mathbb{Z}/n \;\supset\; \{\id\}\] and \(D_n/H\cong\mathbb{Z}/2\) is cyclic.
\((c)\;\) The groups \(S_3\) and \(S_4\) are solvable (and we’ve already seen them in action). For \(S_3\cong D_3\) it is as in the previous example. For \(S_4\), when finding the Galois groups of quartic polynomials, we saw the sequence \[S_4 \;\supset\; A_4 \;\supset\; V_4\cong\mathbb{Z}/2\times\mathbb{Z}/2 \;\supset\; \mathbb{Z}/2 \;\supset\; \{\id\}\] where \(V_4/(\mathbb{Z}/2)\cong\mathbb{Z}/2\), \(A_4/V_4\cong A_3=\mathbb{Z}/3\) and \(S_4/A_4\cong \mathbb{Z}/2\). On the other hand, we’ll see in the next section that \(S_n\) and \(A_n\) are not solvable when \(n\geq 5\).
\((d)\) Burnside’s Theorem states that every group of order \(p^aq^b\) where \(p\) and \(q\) are prime is solvable. You may possibly see a proof next year in Representation Theory IV.
\((e)\) The Feit-Thompson Theorem says that every finite group of odd order is solvable. This result is remarkably simple to state but immensely difficult: the proof took 255 pages.
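The chain for \(S_4\) in part \((c)\) can be verified by brute force. The following sketch realises \(S_4\) as permutations of \(\{0,1,2,3\}\) (our own labelling), takes \(\mathbb{Z}/2\subset V_4\) to be generated by one double transposition, and checks that each term is normal in the previous one with the stated index. Helper names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """Parity via counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def is_normal(H, G):
    """Check g h g^{-1} lies in H for all g in G, h in H."""
    return all(compose(compose(g, h), inverse(g)) in H for g in G for h in H)

e = (0, 1, 2, 3)
S4 = set(permutations(range(4)))
A4 = {p for p in S4 if is_even(p)}
V4 = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}   # identity and double transpositions
C2 = {e, (1, 0, 3, 2)}

chain = [S4, A4, V4, C2, {e}]
indices = [len(G) // len(H) for G, H in zip(chain, chain[1:])]
normal = [H <= G and is_normal(H, G) for G, H in zip(chain, chain[1:])]

print(indices, normal)
```

Every index in the chain is prime, and a group of prime order is cyclic, so the chain satisfies Definition 8.3.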
We can now finally state Galois’ great achievement of finding a necessary and sufficient condition for a polynomial to be solvable by radicals.
Theorem 8.6 (Galois' Theorem) Let \(f(x)\in K[x]\) be an irreducible polynomial over a field \(K\) of characteristic zero. Then \(f(x)\) is solvable in radicals over \(K\) if and only if the Galois group \(G_f\) is solvable.
Proof. If we assumed \(K\) contained sufficiently many roots of unity and that adjoining various roots gave Galois extensions, this would be relatively easy. The Galois correspondence would translate between successive steps in a radical extension and cyclic steps in the sequence of subgroups via Kummer Theory. Handling the general case involves enlarging fields to make Galois extensions with extra roots of unity.
Suppose \(f(x)\) is solvable in radicals over \(K\) so that it has all of its roots in a splitting field \(L/K\) which is a Galois radical extension. We want to show that \(G_f=\Gal(L/K)\) is a solvable group. There is a tower \[K=K_0 \;\subset\; K_1=K_0(\alpha_1) \;\subset\; \cdots \;\subset\; K_m=K_{m-1}(\alpha_m)=L\] where \(\alpha_i^{n_i}\in K_{i-1}\). Let \(n\) be a multiple of each \(n_i\), write \(\zeta\) for a primitive \(n\)-th root of unity and consider the extension \(L(\zeta)/K\). It is a splitting field of \((x^n-1)f(x)\) over \(K\). Defining \(M=M_0=K(\zeta)\) and \(M_i=M_{i-1}(\alpha_i)\), we have a tower \[K\subset M=M_0=K(\zeta) \;\subset\; M_1=M_0(\alpha_1) \;\subset\; \cdots \;\subset\; M_m=M_{m-1}(\alpha_m)=L(\zeta).\] The extension \(M/K\) is a cyclotomic extension so has abelian Galois group. Furthermore, each \(M_{i-1}\) contains a primitive \(n_i\)-th root of unity so by Kummer Theory, the extensions \(M_i/M_{i-1}\) are Galois with cyclic Galois group. By the Galois correspondence, we obtain a sequence of subgroups of \(G=\Gal(L(\zeta)/K)\) \[G \;\supset\; G_0 \;\supset\; G_1 \;\supset\; \cdots \;\supset\; G_{m-1} \;\supset\; G_m=\{\id\}. \] Here \(G/G_0=\Gal(M/K)\) is abelian and each \(G_{i-1}/G_i=\Gal(M_i/M_{i-1})\) is cyclic, so \(G=\Gal(L(\zeta)/K)\) is a solvable group. Finally, \(G_f=\Gal(L/K)\) is a quotient of \(G=\Gal(L(\zeta)/K)\) and so \(G_f\) is also a solvable group.
Conversely, let \(L\) be a splitting field of \(f(x)\) over \(K\) and suppose that \(G=G_f=\Gal(L/K)\) is a solvable group. We want to show that the roots of \(f(x)\) lie in a radical extension of \(K\). There is a decreasing sequence of subgroups \[G=G_0 \;\supset\; G_1 \;\supset\; \cdots \;\supset\; G_{m-1} \;\supset\; G_m=\{\id\}\] with \(G_i\) normal in \(G_{i-1}\) and \(G_{i-1}/G_i\) cyclic. By the Galois correspondence, there is a tower of fixed fields \[K=K_0=L^{G_0} \;\subset\; K_1=L^{G_1} \;\subset\; \cdots \;\subset\; K_{m-1}=L^{G_{m-1}} \;\subset\; K_m=L.\] Each \(K_i/K_{i-1}\) is a cyclic extension, i.e. it is Galois and \(\Gal(K_i/K_{i-1})=G_{i-1}/G_i\) is a cyclic group.
Let \(n=[L:K]=|G|\), write \(\zeta\) for a primitive \(n\)-th root of unity and define \(M_i=K_i(\zeta)\). In particular, \(M_m=L(\zeta)\) is a splitting field of \((x^n-1)f(x)\) over \(K\). For each \(i\), the group \(H_i=\Gal(M_m/M_i)\) is a subgroup of \(H=\Gal(M_m/K)\). Now \(M_m\) is a splitting field over \(M_i\) and \(\Gal(M_i/M_{i-1})\cong H_{i-1}/H_i\) is isomorphic to a subgroup of \(\Gal(K_i/K_{i-1})\cong G_{i-1}/G_i\) and is hence cyclic. We therefore have a sequence of subgroups \[\begin{align*} H_0&=\Gal(M_m/M_0)=\Gal(L(\zeta)/K(\zeta)) \\ &\;\supset\; H_1=\Gal(M_m/M_1) \;\supset\; \cdots \;\supset\; H_{m-1}=\Gal(M_m/M_{m-1})\;\supset\; H_m=\{\id\} \end{align*}\] with each \(H_i\) normal in \(H_{i-1}\) and \(H_{i-1}/H_i\) cyclic. Now this sequence doesn’t start with \(H=\Gal(M_m/K)\). However, \(H_0=\Gal(M_m/M_0)\) is normal in \(H\) since \(M_0=K(\zeta)\) is a splitting field over \(K\) and moreover, \(H/H_0\) is abelian. We see \(M_0/K\) is clearly obtained by adjoining a root of an element of \(K\) and Kummer Theory gives a tower of the form \[K\;\subset M_0=K(\zeta) \;\subset\; M_1=M_0(\alpha_1)\;\subset\;\cdots \;\subset\; M_m=M_{m-1}(\alpha_m)=L(\zeta)\] where \(\alpha_i^{n_i}\in M_{i-1}\) for some \(n_i\). Finally, the roots of \(f(x)\) lie in this radical extension \(L(\zeta)/K\) and we are done.
8.5 Polynomials not solvable by radicals
By the criterion in the last section, a polynomial \(f(x)\in K[x]\) is solvable in radicals over \(K\) precisely when its Galois group \(G_f\) is solvable. Thus to find non-solvable polynomials, we need to know some non-solvable groups. We will show that \(S_n\) and \(A_n\) are not solvable when \(n\geq 5\). First we recall a simple result about \(A_n\) from Algebra II.
Lemma 8.3 The alternating group \(A_n\) is generated by the set of \(3\)-cycles \((ijk)\).
Proof. First, \(A_1=A_2\) is the trivial group so there’s nothing to prove. For \(n\geq 3\), any element in \(A_n\) is a product of an even number of transpositions \[(i_1j_1)(i_2j_2)\cdots(i_{2r}j_{2r}).\] We can then combine pairs of transpositions from left to right as follows:
cancel a repeated pair \((ij)(ij)=\id\),
write a non-disjoint pair as \((ij)(ik)=(ikj)\),
write a disjoint pair as \((ij)(kl)=(ij)(jk)(jk)(kl)=(ijk)(jkl)\).
\(\;\)
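The three rewriting rules in the proof depend on the convention that permutations compose from right to left. Here is a small sketch which checks them exhaustively for all distinct \(i,j,k,l\in\{1,...,5\}\); the helpers `cycle` and `compose` are our own.

```python
from itertools import permutations

def cycle(n, *pts):
    """The cycle (pts[0] pts[1] ...) on {1..n}; index 0 is an unused dummy."""
    p = list(range(n + 1))
    for a, b in zip(pts, pts[1:] + pts[:1]):
        p[a] = b
    return tuple(p)

def compose(p, q):
    """Return p∘q (apply q first, then p): the right-to-left convention."""
    return tuple(p[q[i]] for i in range(len(q)))

def identities_hold(i, j, k, l, n=5):
    identity = tuple(range(n + 1))
    ij, ik, kl = cycle(n, i, j), cycle(n, i, k), cycle(n, k, l)
    return (
        compose(ij, ij) == identity                                      # repeated pair
        and compose(ij, ik) == cycle(n, i, k, j)                         # non-disjoint pair
        and compose(ij, kl) == compose(cycle(n, i, j, k), cycle(n, j, k, l))  # disjoint pair
    )

all_ok = all(identities_hold(*q) for q in permutations(range(1, 6), 4))
print(all_ok)
```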
We can now prove the result we need.
Theorem 8.7 For \(n\geq 5\), the alternating group \(A_n\) and symmetric group \(S_n\) are not solvable.
Proof. If we can show that \(A_n\) is not solvable, then Lemma 8.2 \((i)\) implies \(S_n\) is not solvable either.
Suppose on the contrary that \(A_n\) is solvable, and so there is a decreasing sequence of subgroups \[A_n=G_0 \;\supset\; G_1 \;\supset\; \cdots \;\supset\; G_m=\{\id\}\] with \(G_i\) normal in \(G_{i-1}\) and \(G_{i-1}/G_i\) cyclic. In particular, we have the canonical projection \[ \pi:A_n\longrightarrow Q=A_n/G_1\] where the quotient \(Q\) is abelian. We can assume this group \(Q\) is non-trivial, since otherwise \(A_n=G_0=G_1\). We will show that such a homomorphism \(\pi\) is actually the trivial map \(\pi(g)=\id\) for each \(g\in A_n\), which contradicts the fact that \(\pi\) is surjective.
Choose an arbitrary \(3\)-cycle \(g=(i_1i_2i_3)\) in \(A_n\). We can choose \(i_4,i_5\) so that \(i_1,i_2,i_3,i_4,i_5\) are distinct elements of \(\{1,...,n\}\). (It is here where we require \(n\geq 5\).) Now define \(g_1=(i_1i_2i_4)\) and \(g_2=(i_1i_3i_5)\) and compute \[\begin{align*} g_1g_2g_1^{-1}g_2^{-1} &= (i_1i_2i_4)(i_1i_3i_5)(i_1i_4i_2)(i_1i_5i_3) \\ &=(i_1i_2i_3)=g. \end{align*}\] Applying \(\pi\) to this, using the fact that \(Q=\pi(A_n)\) is abelian, we find \[\pi(g)=\pi(g_1)\pi(g_2)\pi(g_1)^{-1}\pi(g_2)^{-1}=\id.\] That means \(\pi\) sends \(3\)-cycles to the identity, and since \(A_n\) is generated by \(3\)-cycles, the result follows.
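The commutator identity at the heart of this proof is easy to get wrong by hand, so here is a sketch that checks it for every choice of distinct \(i_1,...,i_5\in\{1,...,5\}\), composing permutations from right to left. The helper names are our own.

```python
from itertools import permutations

def cycle(n, *pts):
    """The cycle (pts[0] pts[1] ...) on {1..n}; index 0 is an unused dummy."""
    p = list(range(n + 1))
    for a, b in zip(pts, pts[1:] + pts[:1]):
        p[a] = b
    return tuple(p)

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def commutator_is_three_cycle(i1, i2, i3, i4, i5, n=5):
    g1 = cycle(n, i1, i2, i4)
    g2 = cycle(n, i1, i3, i5)
    # g1 g2 g1^{-1} g2^{-1}, grouped using associativity of composition
    comm = compose(compose(g1, g2), compose(inverse(g1), inverse(g2)))
    return comm == cycle(n, i1, i2, i3)

all_ok = all(commutator_is_three_cycle(*choice) for choice in permutations(range(1, 6)))
print(all_ok)
```

So every \(3\)-cycle in \(A_5\) is a commutator of two \(3\)-cycles, which is what forces any homomorphism to an abelian group to be trivial.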
Let \(L=k(x_1,...,x_n)\) be the field of rational expressions in independent indeterminates \(x_1,...,x_n\) and coefficients in a field \(k\) (of characteristic zero). Then the general monic polynomial of degree \(n\) with roots \(x_1,...,x_n\) is \[f(x)=(x-x_1)(x-x_2)\cdots(x-x_n)=x^n-e_1x^{n-1}+e_2x^{n-2}+\cdots+(-1)^ne_n\] where \(e_1,...,e_n\) are the elementary symmetric polynomials in \(x_1,...,x_n\). The coefficients of \(f(x)\) lie in \(K=k(e_1,...,e_n)\) and we proved at the start of this chapter that \(\Gal(L/K)=S_n\). We conclude that \(f(x)\) is not solvable by radicals over \(K\) when \(n\geq 5\), i.e. it is impossible to find formulas for the roots \(x_1,...,x_n\) in radicals from the coefficients of \(f(x)\).
Now the above shows that there is no general solution for quintic polynomials. However, that doesn’t mean quintics can never be solved by radicals. For instance, reducible ones certainly can be and even some irreducible ones can be too. An example from Stewart’s book is \(x^5+15x+12\). This polynomial is irreducible over \(\mathbb{Q}\) (by Eisenstein with \(p=3\)) and one can show its Galois group is a solvable subgroup of order \(20\) inside \(S_5\). It’s thus possible to solve it by radicals, with one root being \[\sqrt[5]{\frac{-75+21\sqrt{10}}{125}}+\sqrt[5]{\frac{-75-21\sqrt{10}}{125}}+\sqrt[5]{\frac{225+72\sqrt{10}}{125}}+\sqrt[5]{\frac{225-72\sqrt{10}}{125}}\] and similar expressions for the other four roots. This specific group of order \(20\) together with \(\mathbb{Z}/5\) and \(D_5\) are the only possibilities for an irreducible quintic polynomial with solvable Galois group and they occur very rarely. If you pick an irreducible quintic polynomial in \(\mathbb{Z}[x]\) at “random” in a suitable sense, it will have Galois group \(S_5\) with probability \(1\).
It would be nice to end with an explicit example of a polynomial in \(\mathbb{Q}[x]\) which can’t be solved by radicals. We can adapt the above ideas to do this:
Theorem 8.8 Let \(f(x)\in\mathbb{Q}[x]\) be an irreducible polynomial of degree \(5\) with exactly three real roots. Then \(f(x)\) has Galois group \(G_f\cong S_5\).
Proof. Let \(L\) be the splitting field of \(f(x)\) over \(\mathbb{Q}\). Then \(G_f=\Gal(L/\mathbb{Q})\) acts by permuting the five distinct roots of \(f(x)\) so can be considered as a subgroup of \(S_5\). Given that three roots are real, the other two are a complex conjugate pair. In particular, complex conjugation restricts to \(L\) to give a non-trivial automorphism of order \(2\) which fixes the three real roots and swaps the other two, i.e. \(G_f\subset S_5\) contains a transposition. Now \(f(x)\) is irreducible so its degree \(\deg f(x)=5\) divides \(|G_f|\) via the Tower Law. Recall Cauchy’s Theorem from Algebra II:
\(\;\;\;\;\) Suppose \(G\) is a finite group and prime \(p\) divides \(|G|\). Then there is an element of order \(p\) in \(G\).
In particular, since \(5\) divides \(|G_f|\), there is an element \(\sigma\in G_f\) of order \(5\). Now the only elements of order \(5\) in \(S_5\) are \(5\)-cycles, so \(G_f\subset S_5\) contains a \(5\)-cycle.
We claim that any subgroup of \(S_5\) containing a transposition and a \(5\)-cycle is actually the whole group \(S_5\).
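Without loss of generality, the transposition is \((12)\) and the \(5\)-cycle is \((12345)\): relabel the roots so that the transposition is \((12)\), replace the \(5\)-cycle by a power of itself sending \(1\) to \(2\), and then relabel the remaining three roots. Then \[(12345)(12)(12345)^{-1}=(23).\] Similarly, we obtain the transpositions \((34)\) and \((45)\).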
Finally, \(S_5\) is generated by these adjacent transpositions and we conclude \(G_f\cong S_5\).
\(\;\)
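The group-theoretic claim in this proof can also be confirmed by exhaustive search, without any relabelling. The sketch below represents \(S_5\) as permutations of \(\{0,...,4\}\) (our own labelling) and checks that every transposition together with every \(5\)-cycle generates a group of order \(120\). Helper names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    """Closure of the generators under composition (a finite group)."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def order(p):
    identity = tuple(range(len(p)))
    k, q = 1, p
    while q != identity:
        q, k = compose(p, q), k + 1
    return k

S5 = list(permutations(range(5)))
# A permutation of 5 points with exactly 3 fixed points must swap the other two.
transpositions = [p for p in S5 if sum(i == x for i, x in enumerate(p)) == 3]
# The only elements of order 5 in S_5 are the 5-cycles.
five_cycles = [p for p in S5 if order(p) == 5]

sizes = {len(generate([t, c])) for t in transpositions for c in five_cycles}
print(len(transpositions), len(five_cycles), sizes)
```

The search relies on \(5\) being prime; for composite \(n\), a transposition and an \(n\)-cycle need not generate \(S_n\).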
Example 8.7 The polynomial \(f(x)=x^5-6x+3\in\mathbb{Q}[x]\) is irreducible over \(\mathbb{Q}\) by Eisenstein’s Criterion with \(p=3\). Furthermore, \(f(x)\) has exactly three real roots. You could just observe this by plotting a graph, considering \(f(x)\) as a function \(\mathbb{R}\to\mathbb{R}\).

Whilst this is convincing, to check it properly only takes a little bit of analysis:
\(f(x)\) is continuous and changes sign at least \(3\) times \[\lim_{x\to-\infty} f(x)=-\infty,\quad f(0)>0,\quad f(1)<0\quad\text{and}\quad \lim_{x\to+\infty} f(x)=+\infty\] so by the Intermediate Value Theorem, \(f(x)\) has at least \(3\) real roots.
Furthermore, \(f'(x)=5x^4-6\) has only two real roots \(\pm\sqrt[4]{6/5}\) so by Rolle’s Theorem, \(f(x)\) has at most \(3\) real roots.
In particular, the Theorem applies and shows that \(x^5-6x+3\in\mathbb{Q}[x]\) has Galois group \(S_5\) and is hence not solvable by radicals over \(\mathbb{Q}\).
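The sign changes used in Example 8.7 can be reproduced numerically. The following sketch replaces the limits at \(\pm\infty\) by the sample points \(x=\pm 2\) (our choice), locates one root in each bracket by bisection, and evaluates \(f\) at the two critical points \(\pm\sqrt[4]{6/5}\). It is a floating-point illustration rather than a proof.

```python
def f(x):
    return x**5 - 6 * x + 3

def bisect(lo, hi, steps=60):
    """Bisection on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (f(lo) < 0) == (f(mid) < 0):
            lo = mid          # the sign change lies in [mid, hi]
        else:
            hi = mid          # the sign change lies in [lo, mid]
    return (lo + hi) / 2

# f(-2) < 0 < f(0), f(0) > 0 > f(1), f(1) < 0 < f(2)
brackets = [(-2, 0), (0, 1), (1, 2)]
roots = [bisect(lo, hi) for lo, hi in brackets]

# Critical points of f: the real roots of f'(x) = 5x^4 - 6
c = (6 / 5) ** 0.25
local_max, local_min = f(-c), f(c)

print(roots)
print(local_max, local_min)
```

The local maximum is positive and the local minimum is negative, matching the picture of a graph that crosses the axis exactly three times.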
Remark. Unfortunately, we don’t have time to fully discuss the entertaining topic of ruler-and-compass constructions, which has a similar flavour to the above. Briefly, given two points \(P,Q\in\mathbb{R}^2\), a ruler allows us to draw the (infinite) line through \(P\) and \(Q\) and a compass allows us to draw the circle with centre \(P\) and passing through \(Q\). A new point is said to be constructed at the intersection of such lines and circles. Starting from two points \((0,0)\) and \((0,1)\), we say a point \((x,y)\in\mathbb{R}^2\) is constructible if there is a finite sequence of ruler and compass operations constructing \((x,y)\). One can prove
Theorem If a point \((x,y)\in\mathbb{R}^2\) is constructible, then \(x\) and \(y\) are algebraic over \(\mathbb{Q}\) and contained in a field \(L\) at the top of an iterated quadratic tower \[\mathbb{Q}=K_0 \;\subset\; K_1 \;\subset\; \cdots \;\subset\; K_m=L\] where \([K_i:K_{i-1}]=2\) for each \(i\). In particular, \([L:\mathbb{Q}]=2^m\) is a power of \(2\) and so, by the Tower Law, the degrees of \(x\) and \(y\) over \(\mathbb{Q}\) are also powers of \(2\).
This enables us to answer a number of classical problems:
It is impossible to trisect a general angle using ruler and compass constructions. For instance, by a bit of elementary geometry, if we could trisect the angle \(\pi/3\), then it would be possible to construct a line of length \(\cos(\pi/9)\). But this number has degree \(3\) over \(\mathbb{Q}\) which is not a power of \(2\).
It is impossible to double the cube (i.e. construct a cube with twice the volume of the unit cube) using ruler and compass constructions. If we could, then it would be possible to construct a line of length \(\sqrt[3]{2}\). But this number has degree \(3\) over \(\mathbb{Q}\) which is not a power of \(2\).
It is impossible to square the circle (i.e. construct a square with the same area as a unit circle) using ruler and compass constructions. If it were possible, then the side length \(\sqrt{\pi}\) of the square would be algebraic over \(\mathbb{Q}\) with degree a power of \(2\), and so \(\pi\) would be algebraic over \(\mathbb{Q}\). But we know (though it’s difficult to prove!) that \(\pi\) is transcendental over \(\mathbb{Q}\).
For which integers \(n\) can we construct a regular \(n\)-sided polygon with ruler and compass constructions? To do so requires constructing a line of length \(\cos(2\pi/n)\) and so it’s possible only when this number has degree a power of \(2\) over \(\mathbb{Q}\). Gauss discovered (using cyclotomic polynomials) that it’s possible exactly when \(n=2^rp_1\cdots p_s\) where the \(p_j\) are distinct Fermat primes, i.e. primes of the form \(2^{2^k}+1\). (These are precisely the \(n\) where \(\varphi(n)\) is a power of \(2\).) There are only \(5\) known Fermat primes \[2^{2^0}+1=3,\quad 2^{2^1}+1=5,\quad 2^{2^2}+1=17,\quad 2^{2^3}+1=257,\quad 2^{2^4}+1=65537.\] It is a longstanding open problem whether there are any more, though heuristic arguments suggest there are not, and the next smallest would have to have at least a billion decimal digits. In particular, it isn’t possible to construct a regular heptagon, but it is possible to construct a regular heptadecagon (using our solution to Exercise 6.10).
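The two descriptions of constructible polygons in the last item (the condition on \(\varphi(n)\) and the factorisation into Fermat primes) can be compared for small \(n\). This is a rough sketch in Python rather than anything from the notes: the function names are hypothetical, `FERMAT_PRIMES` contains only the five known Fermat primes listed above, and the comparison is only run for \(n<300\).

```python
from math import gcd

def phi(n):
    """Euler's totient by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def is_power_of_two(m):
    return m >= 1 and (m & (m - 1)) == 0

def constructible_by_phi(n):
    """Criterion: phi(n) is a power of 2."""
    return is_power_of_two(phi(n))

FERMAT_PRIMES = [3, 5, 17, 257, 65537]  # the five known Fermat primes

def constructible_by_factorisation(n):
    """Criterion: n = 2^r times a product of distinct known Fermat primes."""
    while n % 2 == 0:
        n //= 2
    for p in FERMAT_PRIMES:
        if n % p == 0:
            n //= p      # remove each Fermat prime at most once
    return n == 1

small = [n for n in range(3, 21) if constructible_by_phi(n)]
criteria_agree = all(
    constructible_by_phi(n) == constructible_by_factorisation(n)
    for n in range(3, 300)
)

print(small)
print(criteria_agree)
```

In this range \(7\) is excluded and \(17\) is included, as stated above. The value \(n=18\) is also excluded, since \(\varphi(18)=6\); this is consistent with the trisection item, because \(\cos(2\pi/18)=\cos(\pi/9)\) has degree \(\varphi(18)/2=3\) over \(\mathbb{Q}\).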
