You ask about the mathematical use of words like "field" and "ring" (p.1). I think that "field", "ring", "group" and "domain" are all cases where mathematicians chose a general word meaning "a collection of things that one can choose from", and gave it a technical meaning. I don't think that any particular differences between the meanings of those words motivated these choices; whoever came first got first pick.

So far as I know, all European languages use for "ring", "group" and "(integral) domain" words having the same literal meanings as the English ones; but for "field" they are split: English and Spanish use the word meaning a (farmer's) field, while French and German use the word for "body" (French "corps", German "Koerper"). For the analogous structures where multiplication may be noncommutative, these languages use modified terms: "skew field", "corps gauche", "Schiefkoerper". But Russian avoids this awkwardness: someone must have noticed the situation in Western languages, and cleverly assigned the Russian word for "field" (pole) to the commutative concept, and the word for "body" (telo) to the one without an assumption of commutativity.

I have sometimes conjectured that the choice of the word "ring" was a pun: a subset of C is a ring if and only if it is "closed" under the appropriate operations. (On the other hand, Z/n can be thought of as ring-like in a different way.)

----------------------------------------------------------------------

You ask whether writing "i = sqrt -1" as on p.2 isn't an abuse of notation.

That's a good point, though I wouldn't exactly say it is an abuse of notation: one can only abuse notation after one has decided what the notation will be, and what you've pointed out is that we haven't decided what "sqrt z" should mean when z is a complex number. There are different possible choices. One can, for instance, show that every complex number has a unique square root in the union of the open upper half-plane with the set of nonnegative real numbers. If we decide to call that "sqrt z", then "i = sqrt -1" is correct. In Math 185 one has to make choices like this (defining a "principal value of the square root function"). Here we are instead loosely using "sqrt z" to mean "one of the square roots of z", with the understanding that once this is chosen, the other square root of z will be written "- sqrt z".

----------------------------------------------------------------------

In connection with the discussion on pp.5-6, you ask whether there are other important number systems than those Stewart mentions.

The answer depends on how widely one casts one's net. One can think of all rings as "number systems", so that the rings Z/n, polynomial rings, rings of functions, rings of matrices, etc. are all examples. On the other hand, if one restricts one's attention to structures that "extend" the objects Stewart names in a natural way, there are far fewer, and I'm afraid they are not, as you say, "indispensable in mathematics".

One is the ring of quaternions. As the complex numbers are 2-dimensional over the real numbers, so the quaternions are 2-dimensional over the complex numbers; but they are not commutative. They have a basis over R of 4 elements, 1, i, j, k, and a peculiar multiplication where 1 acts "as one would expect", each of i, j, k has square -1, while the product of two distinct elements among i, j, k is just like their cross-product in physics.
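If you like to see such rules in action, here is a minimal computational sketch (my own illustration, not anything in Stewart), storing a quaternion a + bi + cj + dk as the 4-tuple (a, b, c, d):

    # Quaternion multiplication from the rules i^2 = j^2 = k^2 = -1,
    # ij = k, jk = i, ki = j (and ji = -k, kj = -i, ik = -j).
    import random

    def qmul(p, q):
        a1, b1, c1, d1 = p
        a2, b2, c2, d2 = q
        return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
                a1*b2 + b1*a2 + c1*d2 - d1*c2,
                a1*c2 - b1*d2 + c1*a2 + d1*b2,
                a1*d2 + b1*c2 - c1*b2 + d1*a2)

    i, j, k = (0,1,0,0), (0,0,1,0), (0,0,0,1)
    print(qmul(i, j), qmul(j, i))   # (0,0,0,1) and (0,0,0,-1): ij = k = -ji
    print(qmul(i, i))               # (-1,0,0,0): i^2 = -1, unlike the cross-product

    # Spot-check associativity on a few random integer quaternions.
    for _ in range(5):
        p, q, r = [tuple(random.randint(-3, 3) for _ in range(4)) for _ in range(3)]
        assert qmul(qmul(p, q), r) == qmul(p, qmul(q, r))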
The cross-product is not associative, but the multiplication of the quaternions is, because of the products i^2 = j^2 = k^2 = -1 that don't work like the cross-product. Quaternions are of some use in geometry, but have not shown themselves nearly as important as the real and complex numbers.

There is also an 8-dimensional structure, containing the quaternions, called the octonions, which is not even associative, but still has the property that the product of any two nonzero elements is nonzero. These are less used even than the quaternions. And there things stop: outside of dimensions 1, 2, 4, 8, it has been proved that one can't define any multiplication (bilinear map) on a finite-dimensional vector space over R such that a product of nonzero vectors is nonzero.

If one doesn't restrict oneself to extensions of R, or to finite-dimensional extensions, then there is an endless bounty of fascinating examples, but in general, each of these is just "one among many". I guess there is one class of rings that are of an importance and "uniqueness" that puts them in a class with the reals and the complexes, at least for number-theorists: for each prime number p there is what is called "the ring of p-adic integers". But the construction is not one of which I can give a thumbnail sketch; if you ask in office hours, I can give a 10- or 15-minute explanation of the idea.

----------------------------------------------------------------------

You ask how one would think of the approach to solving the cubic that Stewart sketches on p.8.

I don't know how people arrived at it historically, but here is how I would motivate it. Consider the solutions to the linear and the quadratic equations. The former is just a constant that one calculates by arithmetic, while the two solutions to the latter (as I noted the first day of class) are one constant plus-and-minus another:

    linear:     r = rho

    quadratic:  r_1 = rho_0 + rho_1
                r_2 = rho_0 - rho_1

(where the quadratic formula tells us in detail what rho_0 and rho_1 are). Now +1 and -1 are the two square roots of 1. The three cube roots of 1 are 1, omega and omega^2 (where omega is a primitive cube root of 1), and fiddling around, one can conjecture the following analogous formula:

    cubic:      r_1 = rho_0 + rho_1 + rho_2
                r_2 = rho_0 + omega rho_1 + omega^2 rho_2
                r_3 = rho_0 + omega^2 rho_1 + omega rho_2

I started to show on the first day of class (though I ran out of time) that given a cubic t^3 + c_2 t^2 + c_1 t + c_0, if one assumes it factors as (t - r_1)(t - r_2)(t - r_3) and substitutes in the above values, one gets equations saying that 3 rho_0 = -c_2, while (rho_1)^3 and (rho_2)^3 satisfy a certain quadratic equation. Solving that equation and taking cube roots, one gets values to use for rho_1 and rho_2, and hence a solution to the cubic.

Having gone through these messy calculations, one can use hindsight to simplify the calculation a bit. First, one sees that rho_0 is just an added constant that doesn't affect the way the roots relate to each other, so by changing variables to subtract this constant from t, one can make rho_0 = 0, greatly simplifying the multiplications one has to do in expanding (t - r_1)(t - r_2)(t - r_3). Since 1 + omega + omega^2 = 0, once rho_0 is eliminated the roots r_1, r_2 and r_3 add up to 0; so getting rid of rho_0 is equivalent to making the term c_2 zero.
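Here, as a sketch of that change of variables (my own check, using the sympy library, not from the book), is a verification that substituting t = s - c_2/3 kills the quadratic term:

    # Verify symbolically that t = s - c_2/3 "depresses" the cubic,
    # i.e., removes its quadratic term.
    from sympy import symbols, expand, Poly

    s, t, c0, c1, c2 = symbols('s t c0 c1 c2')
    cubic = t**3 + c2*t**2 + c1*t + c0
    depressed = Poly(expand(cubic.subs(t, s - c2/3)), s)
    print(depressed.coeff_monomial(s**2))   # 0: the s^2 term is gone
    print(depressed.coeff_monomial(s))      # c1 - c2**2/3, the "p" of t^3 + pt + q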
Also, we found that the computation gave us formulas for the _cubes_ of rho_1 and rho_2, i.e., rho_1 and rho_2 are gotten by computing certain cube roots; so we may as well anticipate things by calling them cuberoot(u) and cuberoot(v) when we introduce them. Making these changes, the method I sketched turns into the method Stewart sketched.

Hope this helps.

----------------------------------------------------------------------

You ask why one chooses \alpha and \beta near the bottom of p.9 so that 3\alpha\beta + p = 0.

\alpha and \beta are supposed to represent the cube root of u and the cube root of v in equation (1.4); so 3\alpha\beta + p = 0 is needed for that equation to hold.

----------------------------------------------------------------------

You ask how Stewart gets the 9 expressions on the bottom of p.9.

The preceding computation has shown that any solution of the given cubic will be the sum of a cube root of u and a cube root of v. If we let alpha be one cube root of u, then the others will be omega alpha and omega^2 alpha; similarly, if beta is one cube root of v, the others are omega beta and omega^2 beta. So the sums "a cube root of u plus a cube root of v" are exactly the 9 combinations Stewart shows.

----------------------------------------------------------------------

You ask about the choice of the three "good" expressions among the nine expressions at the bottom of p.9.

Each is a sum of two terms, the first of which is supposed to be the "cube root of u" in equation (1.4) on the preceding page, and the second of which is supposed to be the "cube root of v" there. Suppose alpha and beta are one pair of choices for the cube roots which satisfy (1.4). (We can choose these by letting alpha be any cube root of u, and letting beta = -p/(3 alpha).) Then if we take a different cube root of u for use in constructing a different root, say omega alpha, then the cube root of v that goes with it, to still satisfy (1.4), must be beta / omega. Since omega^3 = 1, we have 1/omega = omega^2, so we can write this cube root of v as omega^2 beta. Similarly, for our third root we use omega^2 alpha + omega beta.

----------------------------------------------------------------------

You ask regarding the discussion at the bottom of p.9, "... can one find a formula for the roots of a cubic which only produces correct roots?"

Yes. Take Cardano's formula, but in place of the second summand, put -p/(3 times the first summand). This will ensure that (1.4) is satisfied, which is what Stewart is concerned with at the bottom of p.9. (The fact that this element will be a cube root of v follows from (1.6), which u and v were chosen to satisfy.) This is really what Stewart does at the bottom of p.9. But the resulting expression, if written in place of Cardano's formula, would lose the beautiful symmetry of that formula.

----------------------------------------------------------------------

You ask about Stewart's statement on p.10, last full paragraph, that Cardano's formula is pretty much useless when there are three real roots.

I don't really know why he says that. Maybe the idea of using complex numbers to find real numbers goes against the grain; but it shouldn't to a mathematician ... .

----------------------------------------------------------------------

You ask about "permuting the roots" of a quintic in the discussion on pp.12-13.
Stewart doesn't mean this discussion to be something he expects you to follow in mathematical detail; he's just throwing around the ideas to give you a sense of the history. (He will be more precise when we actually start developing Galois theory! But it would be nice if he made this basic idea of permuting the roots clearer.)

What he means is that we take some expression in the roots, such as the last display on p.12 or any of the first four on p.13 (for the cubic, the quartic, and the quintic respectively) and replace it by a new expression in which each occurrence of a given root alpha_i is replaced by some fixed alpha_{i'}. For instance, in the last display on p.12, one permutation would replace alpha_1 by alpha_2, alpha_2 by alpha_3, and alpha_3 by alpha_1. One discovers that the effect is to multiply the sum (alpha_1 + omega alpha_2 + omega^2 alpha_3) by omega^2, hence the expression shown, the cube of that sum, is left unchanged. On the other hand, if we interchange alpha_2 and alpha_3 but leave alpha_1 unmoved, the expression on the bottom of p.12 turns into the one on the top of p.13; and we find that every permutation has one of these two effects, i.e., the original expression is always turned into one of a list of two expressions by a permutation of the roots.

The expression Stewart gives on p.13 for the quartic is, as I indicated in class, incorrect. Permutations of the roots turn that expression into six different expressions; but they turn the corrected expression that I wrote on the board,

    (alpha_1 + alpha_2 - alpha_3 - alpha_4)^2

into only three different expressions, which allows one to reduce the quartic to a cubic (but how they allow this is something we won't go into).

----------------------------------------------------------------------

You ask about the explicit solution of a particular quintic referred to on the bottom of p.13.

I don't know the details -- Stewart doesn't claim you should be able to see it; on the contrary, he gives a reference, implicitly encouraging any reader who is interested to look up the paper named. (When a piece of mathematical writing contains a phrase like "see Berndt, Spearman and Williams (2002)", this means that you will find in the bibliography an item identifiable by that author-list and that date. The bibliography of this book is on pp.279-282, and this reference is on the first page of that bibliography.)

----------------------------------------------------------------------

You ask why Exercise 1.10, p.15 says that to try to generalize Bombelli's observation (paragraph below bottom on p.10) is "usually pointless".

I really don't know what Stewart had in mind. The best I've been able to come up with is that when that method is applicable, Cardano's formula gives the answer 2 alpha. So assuming alpha was to be taken rational, an analog of Bombelli's method could only work when the cubic had a rational root, which it usually doesn't. However, that doesn't explain why Stewart begins "When 27 q^2 + 4p^3 < 0".

----------------------------------------------------------------------

You ask, regarding the proof of the Fundamental Theorem of Algebra on p.25, "How did Gauss choose gamma_epsilon(theta) = p(r(epsilon) e^{i theta}) / (r(epsilon)^n + 1)?"

Gauss didn't! Stewart says on the middle of p.22 that "The ideas behind the proof we give here (but not their precise expression) go back to Gauss".
The idea is one that I described in class in previewing that reading on Friday: look at p as giving a function C --> C, and note that if you traverse a circle around the origin in the domain plane, its image will be some sort of loop in C. Moreover, if the circle has very large radius, so that the values you substitute for t have large absolute value, then the t^n term will be much larger than the lower-exponent terms in p, so the loop you get will look very much like the loop traced out by t^n. But from the properties of multiplication in C one can see that as t goes once around the origin, t^n goes around n times; so the same will be true for p(t) when the circle is large. On the other hand, when the circle is very small, p(t) will move very little, and in general won't go around the origin at all. Now if p had no roots, then as we gradually expanded the circle we considered, none of the intermediate circles would cut through the origin, so the number of times those circles went around the origin (their winding numbers) would stay constant as we increased the radius. This would give n = 0, a contradiction.

Stewart's contribution is to take these larger-and-larger loops, which can be described as p(r e^{i theta}) for various radii r, and "scale them down" so that they don't go to infinity, by dividing by r^n + 1, and also to make the process where the domain circle goes to infinity take place in "finite time", by letting r = epsilon/(1 - epsilon). That gives the formula you ask about.

----------------------------------------------------------------------

You write that you don't see why the display on p.25 saying two limits are equal is true.

The author substitutes for gamma_epsilon(theta) the expression by which it was defined earlier on the page. Moreover, one sees that in that expression, in the two places where epsilon appears, it is in the expression r(epsilon); so to understand that limit as epsilon --> 1, one just has to think of what r(epsilon) does as epsilon --> 1. What it does is approach infinity; so Stewart writes the limit as one taken as "r(epsilon) --> infinity". One could quibble about this notation, since r(epsilon) is not an independent variable; but I hope that with this explanation, you see what he means.

----------------------------------------------------------------------

You ask about Stewart's reference to uniform convergence in the sentence after the last display on p.25.

Well, that forced me to think it through, and the reason he gives is wrong -- the fact that theta ranges over a closed (hence compact) interval [0,2pi] does not force the convergence to be uniform (a concept from Math 104, which is not a prerequisite for this course). Rather, one needs to simply work through the preceding calculation more carefully, to verify that it shows continuity of gamma_epsilon(theta) as a function of two variables, and hence gives the equality of winding numbers.

You asked whether it isn't sufficient just to show continuity in epsilon and theta separately. It is not. For instance, the function f(x,y) defined to equal xy/(x^2 + y^2) when (x,y) is not (0,0), and to equal 0 at that point, is continuous in x, and continuous in y, but is not continuous as a function of two variables, as can be seen by looking at its behavior on the line y = x, where it is 1/2 everywhere except at the origin, but 0 there. (That's an example one usually sees in Math 53.) However, don't worry -- because this step is essentially a Math 104/Math 185 argument, you are not responsible for it in this course.
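If you'd like to see that example numerically, here is a quick check (my own sketch, not part of the course material):

    # f(x, y) = xy/(x^2 + y^2) away from the origin, f(0, 0) = 0.
    def f(x, y):
        return x*y/(x**2 + y**2) if (x, y) != (0, 0) else 0.0

    for t in [0.1, 0.01, 0.001]:
        print(f(t, 0.0), f(0.0, t), f(t, t))
    # Along each axis the values are 0, matching f(0,0); along y = x they
    # are 0.5 no matter how small t is, so f is discontinuous at (0,0)
    # as a function of two variables.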
As we get close to the first midterm, I'll have to decide what, if anything, you are responsible for from this section; but it won't be much beyond the fact that the Fundamental Theorem of Algebra holds.

----------------------------------------------------------------------

You ask whether, in the proof of Lemma 3.5, p.34, there shouldn't be a largest k such that kd divides f and g.

If our base ring were the integers, this would be true; but over a field K, every nonzero element has an inverse. So if a polynomial a divides another polynomial b, say with b = ac, then for every nonzero field-element k the polynomial ka will also divide b, since b = (ka)(k^{-1}c).

----------------------------------------------------------------------

You ask about Stewart's choice of definition of "irreducible" on p.36, under which 6t+3 is considered "irreducible".

I am unhappy with his choice of definition, but I can see the point of it: he wants "reducibility" to mean that the polynomial can be broken into factors each of which can contribute a nonempty set of roots -- if not over the given field, then possibly over some larger field. If you factor 6t+3 as 6 . (t + 1/2), the factor "6" can't contribute any roots, so he doesn't want this to count as a "reduction".

----------------------------------------------------------------------

In connection with the first sentence on p.39, you ask about the feasibility of using computers to factor polynomials.

It's a subject I know nothing about. On the latest (=2nd) homework sheet, in the two unassigned "Not from Stewart" problems, I show how to estimate the size of the coefficients of possible factors. You could work out the size of the bounds obtained, and hence the number of possible factors that would have to be checked. But as to how these can be improved on by more careful arguments, what other methods can serve to cut down the set of factors that need to be checked, and whether people have written software using these ideas, I don't know.

----------------------------------------------------------------------

You ask about the basis for the statement on p.40, next-to-last line of section 3.3, "Clearly, a_1 ... a_s = 1".

The polynomial f has two factorizations, one as g_1 ... g_n and another which, Stewart has proved, can be written (a_1 g_1) ... (a_n g_n) after showing that the number of terms, originally called s, is equal to n. This shows that g_1 ... g_n = (a_1 g_1) ... (a_n g_n); hence, cancelling the g's, 1 = a_1 ... a_n. Since, as mentioned, n = s, this can be written a_1 ... a_s = 1.

----------------------------------------------------------------------

You ask how one can tell from Figure 3.1, p.44 that B and C are multiple zeroes of the polynomial.

From the fact that they are zeroes where the derivative is also zero. Note that if f(t) has a zero at t=a, we may write it as (t-a)g(t). Now if we differentiate this and set t = a, the result comes to g(a). (Check this!) Hence if f(t) just has the single zero at t=a coming from the factor t-a, i.e., if g(t) does not also have a zero there, the derivative f'(a) = g(a) is nonzero. This happens at point A; but at points B and C the curve is horizontal, so the derivative is 0, so when we factor f(t) in this way, g(a) must be zero, so f(t) has a multiple zero at a.

It is also easy to show that if a real-valued polynomial has a zero of odd multiplicity at t=a, then its graph crosses the axis there, while if it has a zero of even multiplicity, its graph bounces off the axis (it has the same sign on both sides of t=a).
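Here is a small check of that crossing/bouncing claim (an example of my own, using sympy, not one from the book):

    # This f has a simple zero at t = 1, a double zero at t = 2,
    # and a triple zero at t = 3.
    from sympy import symbols, sign

    t = symbols('t')
    f = (t - 1)*(t - 2)**2*(t - 3)**3
    for a in [1, 2, 3]:
        print(a, sign(f.subs(t, a - 0.01)), sign(f.subs(t, a + 0.01)))
    # The sign changes at t = 1 and t = 3 (odd multiplicity: the graph
    # crosses the axis) and is the same on both sides of t = 2 (even
    # multiplicity: the graph bounces off the axis).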
So, returning to Figure 3.1: we see that f(t) has a zero of even multiplicity at B, and a zero of odd multiplicity >1 at C. The graphs of y = t, y = t^2 and y = t^3 are familiar examples of these kinds of behavior.

----------------------------------------------------------------------

You ask about Stewart's definition on p.50 of a field extension as a certain kind of function (a monomorphism), rather than referring to the fields themselves.

Good question! Conceptually, I would say a field extension consists of two fields K and L together with a connecting monomorphism iota: K --> L. Moreover, in the situations we will be considering, K will most often be given (it will be the field containing the coefficients of the polynomial whose roots we want to adjoin), so the focus will be on L, the field we get by adjoining those roots. So we will often think of L as an extension of the field K. But since a function is considered to determine its domain and codomain, the map iota determines K and L; so many authors, including Stewart, choose for the sake of economy to define the extension as just being that map. The choice is logically satisfactory (everything one wants to say about K, L and iota can be stated in terms of iota, since K and L can be described as its domain and codomain), but I dislike it pedagogically, since the definition doesn't match the way we think of the thing.

----------------------------------------------------------------------

You write

> On p.50, line after Example 4.2, it says that we can
> usually identify K with its image iota(K). When would
> this not be legitimate?

Remember when I talked in class about two "stories", one about Mr. Smith meeting Miss Jones in New York and having a hamburger with her, and the other about Mr. Nagata meeting Miss Kobayashi in Tokyo and having sushi with her; and the fact that, if the plots of the two stories were step-by-step identical, one could say that they were "the same story, just using different names"; in particular, that Mr. Smith was "the same as" Mr. Nagata, etc.? But I also said that if someone wrote a novel in which the same Mr. Smith, Miss Jones, Mr. Nagata and Miss Kobayashi were _all_ characters, then we could no longer say that Mr. Smith was the same as Mr. Nagata, because if we did there'd be no way of describing the interaction between them.

So in the same way, we can identify one mathematical object with another that is isomorphic to it (in particular, K and iota(K)) as long as we are not dealing with a structure to which they both belong as different parts. E.g., if K and iota(K) were different subfields of C and we wanted to look at equations relating elements of both these fields, we could not identify them. I hope this helps.

----------------------------------------------------------------------

You ask what one can adjoin (as in Definition 4.7, p.51) to Q to get R.

Well, the shortest answer is "all elements of R" -- that will certainly do it. Of course, there are smaller sets one can use; e.g., the positive real numbers, or all numbers in the interval [0,1], or all numbers in the interval [0, 0.00001], ... . It's not hard to show that using any of these and elements of Q, one can get all elements of R. A better understanding of this sort of relation between Q and R would go way beyond the reach of this course. One thing one can certainly say is that any set X that, when adjoined to Q, gave Q(X) = R, would have to be uncountable. (Reason: every element of Q(X) is a rational expression in finitely many elements of X, so if X were countable, Q(X) would be countable; but R is not.)
----------------------------------------------------------------------

You ask about the expression p(x_1,...,x_n)/q(y_1,...,y_n) at the bottom of p.53.

That is poor notation on Stewart's part. I suppose his thinking was that since X can be an infinite set, but each polynomial can only involve finitely many variables, each of these polynomials should be written using a finite list of elements of X; and since the variables involved in the numerator may be different from those in the denominator, he should use different symbols x_j and y_j. But if he does so, there is no justification for taking the number of variables in the numerator and denominator the same. Of course, one can force them to be the same, by writing in some variables that aren't actually involved. But in that case, one may as well take the sets of variables in the numerator and denominator to be the same, using the union of those that appear in one and in the other. So the sensible alternatives to what he has are to write p(x_1,...,x_m)/q(y_1,...,y_n) with possibly different numbers of variables, or p(x_1,...,x_n)/q(x_1,...,x_n) with the same set of variables.

----------------------------------------------------------------------

In your pro forma question you ask why, in the discussion on p.54, i \in L' and \sqrt 5 \in L' imply L is contained in L'; and you base your answer on the statement that every element of L has the form a + bi + c \sqrt 5.

But that is not true. Every element of L = Q(i, -i, sqrt 5, -sqrt 5) has the form a + bi + c \sqrt 5 + d i \sqrt 5. You could give an argument based on that fact, but it wouldn't be a robust argument, because if you wanted to reason similarly about field extensions generated by different families of elements, you would have to figure out, in each case, the form that elements of that field have. A better reason is the following: L is defined to be the field generated over Q by i and sqrt 5, i.e., the intersection of all subfields of C containing those elements. So, since L' is one of the fields containing those elements, L is contained in it.

Likewise, in proving L' = Q(i + sqrt 5) is contained in L, you claim that every element of L' has the form a + b(i + sqrt 5), which also isn't true. One can show that every element of L' has the form a + b(i + sqrt 5) + c(i + sqrt 5)^2 + d(i + sqrt 5)^3 -- we will be able to see that easily at the end of next week -- but we can get the inclusion of L' in L without knowing this, simply from the definition of L' = Q(i + sqrt 5) as the intersection of all subfields of C that contain i + sqrt 5. Since L is a subfield containing i + sqrt 5, L' must be contained in L.

----------------------------------------------------------------------

You ask what Stewart means on p.58, line 6, by "Separating out terms of odd and even degree ...".

He has a polynomial p(t) \in Q[t]. Each term of this polynomial has the form a_j t^j where j is a natural number. Each natural number is either even or odd. Adding up the terms of even degree, one gets a polynomial involving only even powers of t, t^{2h} = (t^2)^h; so it can be regarded as a polynomial in t^2, a(t^2). Adding up the terms of odd degree, one gets a polynomial involving only terms t^{2h+1} = t . (t^2)^h; so it can be regarded as t times a polynomial in t^2: t . b(t^2). Hence p(t) = a(t^2) + t b(t^2).

Now Stewart has assumed that p(sqrt pi) = 0. Substituting into the above formula for p(t), and recalling that the square of sqrt pi is pi, one gets a(pi) + (sqrt pi) b(pi) = 0, as he states.
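In case a concrete instance helps, here is a little sympy computation (my own sketch) performing this separation on an arbitrary example and checking the identity p(t) = a(t^2) + t b(t^2):

    # Split p(t) into its even-degree and odd-degree parts.
    from sympy import symbols, Poly, expand

    t, u = symbols('t u')
    p = Poly(3*t**5 + t**4 - 2*t**3 + 7*t**2 + 5*t - 1, t)
    a = sum(c*u**(e//2) for (e,), c in p.terms() if e % 2 == 0)
    b = sum(c*u**(e//2) for (e,), c in p.terms() if e % 2 == 1)
    print(a)   # u**2 + 7*u - 1
    print(b)   # 3*u**2 - 2*u + 5
    print(expand(a.subs(u, t**2) + t*b.subs(u, t**2) - p.as_expr()))   # 0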
----------------------------------------------------------------------

You ask about the meaning of the "field K(t) of rational expressions" that Stewart uses in Theorem 5.3, p.58.

He defines them more precisely on p.175, last paragraph: K(t) is the field of fractions of K[t].

----------------------------------------------------------------------

You ask about Stewart's excluding m(t) = 0 in the proof of Theorem 5.10, p.61.

Well, as I have said, the wording of Stewart's definition of irreducible on p.36 is definitely a mistake; as it is stated it would allow 0 as an irreducible element; as I have modified it, it does not. One wants to make one's definitions correspond to the properties of natural interest, and the role of 0 in K[t] is so different from that of what we are calling irreducible polynomials that it is proper that the definition be chosen so as not to call them by the same name. If one allowed 0 as an irreducible, Theorem 5.10 would be false for the case m = 0. As you guessed, K[t]/<0> is isomorphic to K[t] -- not a field.

(Actually, the ideal <0> in K[t] does have one important property in common with the ideals <m(t)> for m(t) an irreducible polynomial. Namely, both are ideals I such that xy \in I => x \in I or y \in I. Such an ideal is called a "prime ideal"; an ideal I of a ring R is prime if and only if R/I is an integral domain. 0 and the irreducible polynomials are the only elements of K[t] which generate prime ideals; as such they are called "prime elements". But we won't be using these ideas in this course.)

----------------------------------------------------------------------

You ask whether we are ever interested in the case of Theorem 5.10 (p.61) where m(t) is reducible.

For this course, no. In other contexts, yes; for instance, if m(t) is the minimal polynomial of an n by n matrix A, then the ring generated by A over the field K I_n of scalar matrices is isomorphic to K[t]/<m(t)>.

----------------------------------------------------------------------

You ask about how to compute inverses in fields K[t]/<m(t)>, shown to exist in Theorem 5.10 (p.61).

The computation is essentially the same as that of Exercise 3.3 in the case where the h.c.f. of that exercise is 1, with m(t) in the role of g(t). In K[t] you get 1 = a(t) f(t) + b(t) m(t), so in K[t]/<m(t)> this becomes 1 = [a(t)] [f(t)], since the term [m(t)] is 0. Thus, [a(t)] is an inverse of [f(t)].

You also ask about the structure of K[t]/<m(t)> when m(t) is reducible. There's too much to summarize in a brief e-mail, and the course material doesn't leave me time to go off on tangents like that in class, much as I would enjoy it. The answer depends on whether m(t) has distinct factors or repetitions of one factor, or both. The differences are like those between Z_n when n is a power of a prime, a product of distinct primes, or a product of powers of primes. If you're interested, ask in office hours and I can lead you through some of the ideas.

----------------------------------------------------------------------

You ask (in connection with the vector space structure of an extension field as described on p.67) what you can assume about linear independence of elements of extension fields.

Only what you can prove, or what Stewart has proved for you! Note that Lemma 5.14 (p.63) gives a powerful tool. In connection with the example of the square root of 2 and the cube root of 2 (over Q) that you mention, that can be gotten by applying that Lemma with alpha a 6th root of 2. (Do you see how?)
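If you want a machine check of the key facts behind that hint, here is one (my own sketch, using sympy, not anything in the text -- and it does give part of the answer away):

    # alpha = 2^(1/6) has degree 6 over Q, and sqrt(2), cbrt(2) are powers of it.
    from sympy import Rational, minimal_polynomial, sqrt, cbrt, symbols

    t = symbols('t')
    alpha = 2**Rational(1, 6)
    print(minimal_polynomial(alpha, t))   # t**6 - 2, so [Q(alpha):Q] = 6
    print(alpha**3 == sqrt(2))            # True
    print(alpha**2 == cbrt(2))            # True

Lemma 5.14 then tells you that 1, alpha, ..., alpha^5 are linearly independent over Q, and the two elements you asked about appear among those powers.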
----------------------------------------------------------------------

You ask how much you have to know about cardinals, which Stewart mentions on p.68.

You don't. As Stewart says there, if you are not familiar with cardinals, just understand any non-finite-dimensional extension as having degree "infinity", and interpret the tower law when one or more of the degrees is infinity by the simple formulas he gives at the bottom of the page. In the cases we are most interested in, the degrees will be finite. If you want to learn about cardinals in the future, you might take Math 135. (Math 104 and Math 55 usually give a tiny bit about the subject -- the distinction between "countable" and "uncountable" -- but there's a lot more to it.)

----------------------------------------------------------------------

You ask, in connection with Example 6.8 on p.71, whether [\Q(a_1,...,a_n) : \Q(a_1,...,a_{n-1})] is either 0 or 2 depending on whether a_n belongs to \Q(a_1,...,a_{n-1}) or not.

I assume that by "0 or 2" you mean "1 or 2". The answer is -- yes if a_n is a root of a quadratic polynomial over \Q(a_1,...,a_{n-1}) (and so in particular, if it is a root of a quadratic polynomial over \Q). But if it's a root of a degree d polynomial over \Q(a_1,...,a_{n-1}), all you can say is that it has degree <= d -- namely, the degree will be the degree of its minimal polynomial over \Q(a_1,...,a_{n-1}), which will be a divisor of whatever polynomial you know it satisfies.

----------------------------------------------------------------------

You ask about Stewart's statement on p.71 that the degree of Q(sqrt 6, sqrt 10, sqrt 15) is 4 and not 8.

If you multiply sqrt 6 and sqrt 10, and simplify as in High School algebra, you'll get 2 sqrt 15, in terms of which you can express sqrt 15. So Q(sqrt 6, sqrt 10) contains sqrt 15, so Q(sqrt 6, sqrt 10) = Q(sqrt 6, sqrt 10, sqrt 15). There's no general test for roots of arbitrary polynomials; but you can play around with this case and figure out what properties of this extension by square roots make it work, and hence find a general result that will include it.

----------------------------------------------------------------------

You ask whether it would be simpler to find the degree [Q(sqrt 2, sqrt 3, sqrt 5) : Q] by the Tower Law rather than the method of Example 6.8, pp.71-72.

But in that example, Stewart _is_ finding the degree using the Tower Law! The Tower Law expresses that degree as the product of the degrees of three intermediate extensions, and each of those degrees is easily seen to be either 1 or 2. What he then spends most of those two pages doing is showing that those degrees are 2, not 1. For this one needs to know that the extension field is strictly bigger than the base field, i.e., that the new element one adjoins at each stage is not in the field one already has; and that fact is what all the calculations he gives are aimed at getting.

----------------------------------------------------------------------

You ask on what basis Stewart implicitly assumes on p.79 that [K_{j-1}(x_j,y_j) : K_{j-1}(x_j)] = 1 or 2.

The fact that y_j satisfies a quadratic equation over K_{j-1}. Stewart has written down explicitly a quadratic equation satisfied by x_j at the top of the page. He notes there, "The same holds for the y-coordinates."

----------------------------------------------------------------------

You ask about the meaning of the phrase "duplicating the cube" on p.80, Theorem 7.5.
Stewart gives you the meaning of this and the other classical problems on p.75, end of next-to-last paragraph, "These ask, respectively, for ...".

----------------------------------------------------------------------

You ask why the field K_0 is taken to be Q in Theorem 7.5, p.80.

Well, if we are going to duplicate the cube, we only need to be given the length of one side of the given cube, and from it compute the length of one side of the doubled cube. Given a segment representing one side of the given cube, the easiest way to choose coordinates is to take one end of the segment to be the origin and the other end to be (1,0); and if we do so, we find that K_0 = Q(0,1) = Q. (In contrast, when we want to trisect an angle, we must have more than just a line-segment to represent the original angle.)

----------------------------------------------------------------------

You ask whether, on p.92, top paragraph, alpha_2^2 when applied to a complex number x+iy doesn't give (x-iy)^2, which is not the same as x+iy.

No: alpha_2^2 (x+iy) does not mean the result of taking alpha_2 (x+iy) and squaring it; it means the result of applying the square of alpha_2 to x+iy. Here "the square of alpha_2" means the result of composing the function alpha_2 with itself. Since alpha_2 of a complex number gives the conjugate of that number, alpha_2^2 of a complex number is the conjugate of the conjugate, i.e., the original number.

It's true that in some contexts, putting an exponent on a function symbol means taking the operation that applies the function and then forms a power of the result; e.g., sin^2 x means (sin x)^2. But when one is talking about a set of operations under the operation of composition, exponents are always understood to refer to composition of the operations with themselves the indicated number of times.

----------------------------------------------------------------------

You ask whether, in finding the permutations of the roots of a polynomial that determine the Galois group as on pp.91-92, we "only have to make sure that each alpha in Galois(L:K) sends a_i to a root of the minimal polynomial of a_i over K but, of course, only when that root is in L?".

No. As I showed in class, the only permutations of the roots of the polynomial t^3 - 2, namely 2^{1/3}, omega 2^{1/3}, omega^2 2^{1/3}, that determine elements of the Galois group of the extension Q(omega, 2^{1/3}) : Q(omega), i.e., that respect algebraic relations, are the cyclic permutations -- even though a transposition that interchanges two roots and leaves the third one fixed does preserve minimal polynomials. On the other hand, for Q(omega, 2^{1/3}) : Q, such a transposition does induce a member of the Galois group.

----------------------------------------------------------------------

You ask, in connection with the definitions of p.93, whether M^star dagger = M^star dagger star dagger must always hold.

Yes! In fact, even more strongly, M^star = M^star dagger star always holds. (So the equation you ask about can be obtained by applying "dagger" to both sides of this one.) And one doesn't need to know any field theory to prove this -- just the fact that we have two sets called L and Gamma, and some concept of an element of Gamma "fixing" an element of L (we don't have to know what this means), and that M^star is defined as the set of elements of Gamma that "fix" all elements of M, and H^dagger is defined as the set of elements of L "fixed" by all elements of H. Starting with that, the rest is just simple logical reasoning.
(One doesn't even have to assume M a subfield of L; just a subset.) Exactly the same reasoning shows that H^dagger = H^dagger star dagger.

There are many other cases of this pattern in mathematics. Based on the famous case we are studying in this course, the pattern is called a "Galois connection". I discuss it in my Math 245 course notes, /~gbergman/245 , namely in Chapter 5, starting on p.141, near the bottom. (I use "*" there for both the operators that Stewart calls "star" and "dagger".)

----------------------------------------------------------------------

You ask about cases where H^{dagger star} is strictly larger than H (p.93).

In the situation considered in Galois theory -- where L is a finite algebraic extension of K -- that cannot happen; we just aren't ready to prove this yet. For infinite extensions, algebraic or transcendental, it can. For instance, let K = Q and L = Q(sqrt 2, sqrt 3, sqrt 5, sqrt 7, ...), the extension gotten by adjoining the square roots of all primes. Let H be the set of those automorphisms which act by reversing the signs of the square roots of finitely many primes only, leaving the other signs unchanged. (That is, for every finite set of primes, the automorphism that changes the signs of the square roots of the primes in that set belongs to H, and those are all the elements of H.) It is not hard to show that H is a group, and that H^dagger = Q, essentially because every element of L involves square roots of only finitely many primes. But H^{dagger star} = Q^star, and this consists of automorphisms that change the signs of the square roots of arbitrary subsets of the primes; so it is properly larger than H.

----------------------------------------------------------------------

You ask what [L:K] is for L and K as on p.95.

We'll have the tools to answer that when we get to reading #22!

----------------------------------------------------------------------

You ask what map is described in the second line of Lemma 8.11, p.97, as having kernel A_n.

The map from S_n to its cyclic quotient group of order p = 2. Another way of saying this is that if N is a normal subgroup such that S_n / N is cyclic of prime order p, then p = 2 and N = A_n.

----------------------------------------------------------------------

You ask about the notation (a b) on p.97.

An expression (a_1 ... a_k) \in S_n means the cyclic permutation of {1,...,n} which sends a_1 to a_2, a_2 to a_3, etc., and finally a_k to a_1. Here a_1,...,a_k are distinct elements of {1,...,n}.

----------------------------------------------------------------------

You ask why, in the top paragraph of p.99, A_n fixes h_e and h_o.

The general principle is that if a group G acts on a set X, and if N is a normal subgroup of G, then the set of elements of X that are not moved by elements of N is closed under the action of G. For let g be any element of G, and y any element of X that is not moved by N. Then if we apply any n \in N to gy, we get n g y = g (g^{-1} n g) y. Since N is normal, g^{-1} n g belongs to N, so by assumption it does not move y, so n g y = g y, showing that g y is also not moved by n.

Now A_n is a normal subgroup of S_n, so the above principle shows that if h is not moved by elements of A_n, then sigma(h) (which Stewart for no good reason is calling h^sigma on this page) will also not be moved by A_n, so the same is true of 1/2 (h + h^sigma) and 1/2 (h - h^sigma).
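A tiny instance of this setup (my example, not the book's) can be computed with sympy: the polynomial delta = (x1-x2)(x1-x3)(x2-x3) is fixed by every even permutation of the variables, while every transposition sends it to -delta, i.e., gives an h with h^sigma = -h:

    # Check which permutations of x1, x2, x3 fix delta.
    from itertools import permutations
    from sympy import symbols, expand

    x1, x2, x3 = xs = symbols('x1 x2 x3')
    delta = (x1 - x2)*(x1 - x3)*(x2 - x3)
    for perm in permutations(xs):
        image = delta.subs(list(zip(xs, perm)), simultaneous=True)
        print(perm, expand(image - delta) == 0, expand(image + delta) == 0)
    # The three even permutations (the elements of A_3) give image = delta;
    # the three transpositions give image = -delta.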
----------------------------------------------------------------------

You ask why, on p.99, line after second display, he says that A_n and sigma generate S_n.

The index [S_n : A_n] is 2, so there are no intermediate groups, so any subgroup of S_n that properly contains A_n must be S_n itself. Since sigma is not in A_n, the subgroup that it and A_n generate properly contains A_n, so it is S_n.

----------------------------------------------------------------------

You ask why on p.99, 3rd display, K(delta) is contained in K_2.

The author has shown that alpha_1 is in K(delta), so we have K \subset K(alpha_1) \subset K(delta). But [K(delta) : K] = 2, a prime, so by the Tower Law, K(alpha_1) must equal either K or K(delta). It was assumed that alpha_1 is not in K, so K(alpha_1) = K(delta). But K(alpha_1) = K_1 \subset K_2, giving the inclusion you ask about.

----------------------------------------------------------------------

You ask about the term "radical extension" used in the statement of Theorem 8.14, p.100.

The errata sheet on the homework I gave out on Friday tells you that before reading this theorem, you should read the definition of "radical extension" on p.153.

----------------------------------------------------------------------

You ask where the "c" came from in the 4th display on p.101.

P_j and P_k (I hope you made the correction shown on the homework sheet, from Stewart's "P and P_j") are being assumed non-coprime; but they are both monic, so each must divide the other. Since they have the same degree, the only way this can happen is if one is a constant times the other. "c" is that constant.

----------------------------------------------------------------------

You ask what Stewart means by "symmetry under S_n" in the proof of Lemma 8.18 on p.102.

This is the concept that he discussed at the beginning of the section. S_n acts on polynomials and rational functions by permuting the subscripts; if one of these elements is invariant under all these permutations, it is called a "symmetric" polynomial or rational function. "Symmetry" means the property of being symmetric.

----------------------------------------------------------------------

You ask about replacing the definition of Sigma being the splitting field of f over K on p.108 by a single condition, saying that Sigma is generated over K by all the zeroes of f.

This would work for subfields of the complex numbers, where "all the zeroes" means "all the zeroes in C". But although Stewart insists that the only fields we are considering now are subfields of C, he is looking toward the part of the book where he will allow more general fields; and in that situation, there is not one big field where everything lies and every polynomial splits, but different ways of constructing extension fields where various polynomials split; so it won't make sense to talk about a single set comprising "all the zeroes". It is to give a definition that will continue to work in that situation that Stewart uses the form he does.

----------------------------------------------------------------------

You ask whether the statement at the end of Example 9.2, p.108 doesn't contradict Definition 9.1.

A fact can't contradict a definition. A definition says how a word is used, in this case, the word "splitting field". The only thing that could contradict a definition would be if the author used the word "splitting field" in a situation where the conditions stated didn't apply.
----------------------------------------------------------------------

You ask whether condition 2 in the definition of splitting field on p.108 is important.

Yes, definitely! For instance, Lemma 9.5 would not be true without it. Neither would the fact that I sketched at the end of class yesterday, that the splitting field of a polynomial f has enough automorphisms to carry every root of f to every other. For instance, Q(2^{1/2}) is the splitting field of the polynomial t^2 - 2 over Q, and it has an automorphism interchanging the roots of that polynomial. On the other hand, Q(2^{1/4}) is not a splitting field, and even though it contains both roots of t^2 - 2 (namely, the square of 2^{1/4} and the negative of that element), it has no automorphism interchanging them: one of them (the positive one) has a square root in the field while the other doesn't, making such an automorphism impossible.

----------------------------------------------------------------------

You ask whether Stewart's use of sigma_i for the zeroes of f on the first line on p.109 is because we will be considering permutations of these elements.

No; he simply considers any lower-case Greek letter fair game as a symbol for an element of an extension field. It would have been better to have used a letter near the beginning of the Greek alphabet.

----------------------------------------------------------------------

Regarding Definition 9.10 on p.112, you ask

> What is a simple zero? I am pretty sure that it is just a zero with
> multiplicity 1, but I couldn't find a definition for it in Stewart
> (his index is all but worthless).

His index is as good as that of most math texts! It's true that it doesn't have a listing for "simple", but since you suspected that it means "multiplicity 1", you should have then looked for "multiplicity" in the index. It sends you to p.44, where the discussion of the subject begins; the formal definition is on p.45, and you can see there that "simple zero" is defined as you suspected.

----------------------------------------------------------------------

You ask, regarding the proof of Lemma 9.13 on p.113, whether when Stewart uses the term "irreducible factor of g", this should be assumed to be of degree greater or equal to 1.

Right. As shown in my correction to the definition of "irreducible", the words "of positive degree" need to be added to that definition to fit the way Stewart actually uses the term.

----------------------------------------------------------------------

You ask why in the proof of Lemma 9.13, p.113, Stewart can say "by induction g and Dg are coprime".

Well, you have to look back in the proof and see whether he sets up an induction. And in fact, he says at the beginning of the paragraph that he will prove the result by induction on curly-d f, the degree of f. So you should check: will g necessarily have lower degree than f? Does the hypothesis of the statement that is being proved by induction hold for g? If so, then he can assume inductively that the conclusion of that statement also holds for g.

(If, in looking back, you found that he did not explicitly set up any induction, then you would have to ask yourself "Is there some induction he might be considering to be straightforward, so that he can use it without formally setting it up?" But in this case he did formally set it up, so that isn't a problem.)

----------------------------------------------------------------------

You ask how, when we apply Lemma 9.13 to proving Prop.
9.14 on p.113, we know that f and Df will have a common factor over K, and not just over Sigma.

In the sheet of corrections that I gave out at the beginning of the course, it says to correct "Sigma[t]" to "K[t]" in the last line of the statement of Lemma 9.13. If you haven't made those corrections yet, please do so -- understanding the material depends on it! (And reading the proof of the lemma, you'll see that Stewart really does prove that there is a common factor in K[t].)

----------------------------------------------------------------------

You ask what contradiction Stewart gets at the end of the proof of Proposition 9.14, p.114.

Though he doesn't state it explicitly, the conclusion that f is a constant contradicts its being irreducible. (The correction that I made to the definition of irreducible really does correspond to the way he uses it -- though he says after the definition that a constant polynomial will be considered irreducible, throughout the rest of the book he requires irreducibles to have positive degree.) It also contradicts the assumption that f has a common factor of positive degree with Df -- a nonzero constant polynomial can't have a factor of positive degree.

----------------------------------------------------------------------

You ask whether we will need to consider infinite Galois groups or extensions, since the counting principles used depend on finiteness (p.117).

Not in this course! The bijectivity of the Galois correspondence we will prove does not hold in the infinite case; but one can obtain a bijective Galois correspondence by putting a topological structure on the Galois group, and showing that intermediate fields correspond to topologically closed subgroups of that group. This is done with the help of the fact that every normal extension is a union of its finite normal subextensions. I develop this result whenever I teach Math 250B; cf. /~gbergman/grad.hndts/infGal+profin.ps .

----------------------------------------------------------------------

You ask what is meant in the first sentence after display (10.3) on p.118, which says that the relations are linearly independent "unless lambda_1(y) = lambda_2(y) = lambda_3(y), and we can choose y to prevent this."

Equation (10.3) depends on our arbitrary choice of y; one might say that it is not one equation, but a system of equations, one for each y. Some of these equations may be linearly dependent on (10.1); but since lambda_1 and lambda_2 are distinct monomorphisms, there exists an element y such that lambda_1(y) \neq lambda_2(y). Choosing such a y gives us an equation (10.3) that is not linearly dependent on (10.1).

----------------------------------------------------------------------

You ask how Stewart gets the final "= 0" in the first display on p.121.

From equation (10.6), substituted into the bracket on the preceding line. (But if I have time, I will show in class tomorrow a more transparent way of getting the result of this calculation.)

----------------------------------------------------------------------

You ask how the first display on p.121 shows that g_1, ..., g_n are linearly dependent.

Compare the first and last steps -- it shows that y_1 g_1 + ... + y_n g_n = 0.

----------------------------------------------------------------------

You ask how Stewart can conclude that each of the coefficients in the third from last display on p.121 is 0, as expressed by the display after that.
He explains this in the line in between: the preceding display is a system of linear relations like (10.8), but (10.8) was assumed to have the smallest number of nonzero coefficients among all nontrivial systems of relations of that form. The new display has fewer nonzero coefficients than (10.8), so it cannot be a nontrivial relation; it can only be the relation with all coefficients zero. This is exactly like the proof of Lemma 10.1, where equation (10.4) was assumed to have the minimal number of terms, so that in the third display on p.119, all the lambda_i(x) have to have coefficient zero.

----------------------------------------------------------------------

You ask where the g_j's went in the first display on p.122.

Notice that he says on the preceding line "with j=1", and see the end of the very first sentence of the proof, on p.120.

----------------------------------------------------------------------

You ask whether there is a more straightforward way to do the proof of Theorem 10.5 (pp.120-122).

Not a fundamentally simpler way -- it really seems to be a "magical" proof. But there are lots of little details that can be done more nicely than Stewart does. I hope to show some tomorrow. As one example, since linear relations like (10.8) can be multiplied by any element of L and still remain valid, one can assume without loss of generality that y_1 = 1. Then the messy business of multiplying that equation and the one that follows by the first coefficient of the other can be avoided; they will both have first coefficient 1, and one can simply subtract one from the other.

----------------------------------------------------------------------

You ask how to see that there are only 4 candidates for Q-automorphisms in Example 10.7(2), p.122.

He has just noted that there are only 4 values to which a Q-automorphism might send zeta. Since zeta generates the field over Q, once we specify where zeta is sent, this determines where every element is sent. E.g., if zeta is sent to zeta^2 as in "alpha_2", then s zeta^3 must be sent to s (zeta^2)^3 = s zeta.

----------------------------------------------------------------------

You ask why the linear relations among 1, $\zeta$, $\zeta^2$, $\zeta^3$, $\zeta^4$ are generated by $\zeta + \zeta^2 + \zeta^3 + \zeta^4 = -1$ (p.123).

A linear relation among these elements corresponds to a linear combination of 1, t, t^2, t^3, t^4 that gives 0 when zeta is substituted for t. Such a polynomial must be a multiple of the minimal polynomial of zeta, and that polynomial has degree 4; so the only multiples of it in Q[t] that are linear combinations of 1, t, t^2, t^3, t^4 are multiples by constants, i.e., constant multiples of the one equation Stewart gives.

----------------------------------------------------------------------

You ask about Stewart's statement on p.123 that it is easy to find the fixed field of the group he has just described.

The easiest way is to use {zeta, zeta^2, zeta^3, zeta^4} as basis for the extension, so that a general element can be written uniquely as x = q zeta + r zeta^2 + s zeta^3 + t zeta^4. Then the automorphisms listed on p.122 will all fix x if and only if q = r = s = t. So any element of the fixed field has the form q zeta + q zeta^2 + q zeta^3 + q zeta^4 = q (zeta + zeta^2 + zeta^3 + zeta^4) = q (-1) \in Q.

----------------------------------------------------------------------

You ask how Theorem 11.3 (p.126) can be used to construct explicit automorphisms.

To see that, you have to go back to the proof of Theorem 9.6, which it calls on.
And to see how that works, you need to go back to the proofs of the results it calls on in turn.

----------------------------------------------------------------------

You ask in connection with Proposition 11.4 on p.126 whether a K-automorphism sigma of a field extension L of a field K can send a zero alpha of an irreducible polynomial f \in K[t] to an element of L that is not a zero of f.

Definitely not! Since sigma is a K-automorphism, it respects the field operations of L and fixes elements of K. Since f(alpha) is computed using the field operations of L and the coefficients of f, which are elements of K, we have sigma(f(alpha)) = f(sigma(alpha)). The left-hand side is sigma(0) = 0. Equating the right-hand side to 0, we see that sigma(alpha) is also a zero of f.

----------------------------------------------------------------------

You ask why, in Chapter 11, we consider K-monomorphisms and not just K-automorphisms.

I hope I made this clear in class: we need to use K-monomorphisms in "building up" the K-automorphisms that are our ultimate interest. Thus, Corollary 11.11, about K-automorphisms, couldn't be proved without the inductive construction of K-monomorphisms in Theorem 11.10 (p.128). As to why Stewart gives Theorem 11.13, about K-monomorphisms, after he has already gotten Corollary 11.11, I don't really know. Perhaps he will use it in a later chapter.

----------------------------------------------------------------------

You ask how Proposition 11.4 is used on p.129, line 4.

For each alpha_i, that Proposition gives an automorphism taking alpha to alpha_i. So altogether it gives the set of automorphisms described.

----------------------------------------------------------------------

You ask how we know that the phi_ij, defined in the first display on p.129, are distinct.

Good question! Consider two such maps phi_ij and phi_i'j', with (i,j) \neq (i',j'). If i \neq i', then by construction tau_i and tau_i' have different effects on alpha. But rho_j and rho_j' are K(alpha)-automorphisms, so they both fix alpha. Hence the composite maps tau_i rho_j and tau_i' rho_j' act on alpha in the different ways that tau_i and tau_i' do, so they are unequal.

This leaves the case i = i'. Since we are assuming (i,j) \neq (i',j'), we must have j \neq j'. Hence by inductive assumption, rho_j \neq rho_j'. When these are composed with the same automorphism tau_i = tau_i', the composites must also be distinct.

----------------------------------------------------------------------

You ask, in connection with the true/false question 11.7(d) on p.131, for an example of an extension with Galois group of order 1 that is not normal.

The extension gotten by adjoining a cube root of 2 to Q. (This was the first example we saw of a non-normal extension.)

----------------------------------------------------------------------

You ask how, on p.134, second line, we are to deduce normality of L:M from Theorem 9.9.

By applying that theorem twice: the theorem first tells us that L is the splitting field over K of some polynomial f over K, since it is assumed normal over K; we see from this that L is also the splitting field of f over M; hence by a second application of the theorem, it is normal over M.

----------------------------------------------------------------------

You ask what the 2-headed arrow on the upper right in Figure 13.1, p.138 indicates.

It is marked "tau", and it means that the automorphism tau reflects the picture over the diagonal line.
Likewise, the four bent arrows represent sigma, and show that sigma rotates the square of roots 90 degrees counterclockwise. ---------------------------------------------------------------------- In connection with Stewart's statement at the end of p.140 that C^dagger, D^dagger and E^dagger are not normal, you ask for an irreducible polynomial with a root in the last of these fields that does not split in that field. Since C^dagger and E^dagger are complex conjugates of one another, and I discussed the former a bit in class, I'll answer for it instead. I noted that the generator (1+i) xi of C^dagger, when squared, gave -2i xi^2. From this one can deduce that its 4th power is -8; i.e., it is a zero of t^4 + 8. Now I claim that t^4 + 8 is irreducible, and that C^dagger does not contain (1-i) xi, which is also a zero of t^4 + 8. There are various ways to get these two facts. E.g., if t^4 + 8 were reducible, then (1+i) xi would have degree < 4, contradicting the fact that we know the field it generates has degree 4; and (1-i) xi does not belong to that field by the condition a_1 = a_5 on that page. ---------------------------------------------------------------------- You ask about extending the definition of solvable group (p.143) so as to allow an infinite chain of proper subgroups (presumably, having intersection the trivial subgroup). Such a concept can be defined, but it has much weaker properties than that of solvability. In particular, a homomorphic image G/N of a group G with that property need not have it. In fact, any group whatsoever can be written as a homomorphic image of what is called a free group, even though every free group has an infinite chain of normal subgroups with abelian factor groups, and with intersection the trivial subgroup. The reason this is possible is that distinct subgroups of a group, even a family of distinct subgroups which intersect trivially, can have the same image in G/N. (For an example of this phenomenon, consider the subgroups Z > 2Z > 4Z > 8Z > ... of the integers, and what happens when we map Z to Z/3Z. Even though the original subgroups have trivial intersection, their images in Z/3Z do not.) Anyway, the statement that a group G has a series of subgroups of the sort you describe can be shown to be equivalent to the statement that for every nonidentity element g of G, there is a solvable homomorphic image G/N in which g has nonidentity image. A group with that property is called "residually solvable". ---------------------------------------------------------------------- In connection with the warning at the bottom of p.143, that given a chain of three groups, each normal in the next, the first may not be normal in the third, you ask whether, if the first and second are normal in the third, the first will be normal in the second. More is true: As long as the first is normal in the third and contained in the second, it will be normal in the second. This follows immediately from the definition of normality. ---------------------------------------------------------------------- You ask about the first isomorphism symbol in the next-to-last display on p.145. This is an application of (what Stewart numbers as) the First Isomorphism Theorem, with the order of the two sides reversed. (With this information you should be able to see what groups have the roles of the H and A of that theorem.)
---------------------------------------------------------------------- You ask in connection with the concept of simple groups (p.146) whether there is a relation between these and simple extensions. No. "Simple" is used in both definitions in the sense of its Latin root, meaning "one-fold". But in the definition of "simple group" there is a very strong concept of "one-fold-ness": the group represents a link in a chain of normal subgroups where no further subdivision is possible. In the definition of "simple extension" we have a different, and rather weak sort of one-fold-ness: an extension that can be generated by one element. ---------------------------------------------------------------------- You ask how, in the proof of Theorem 14.7, p.147, one concludes that if a normal subgroup N of A_n contains one 3-cycle it contains all. First, Stewart observes that "without loss of generality" the one 3-cycle that we know it contains can be assumed to be (123). This is because the elements 1, 2, 3 of the set which our permutations act on are no different from any others so far as the definition of A_n is concerned; so if N <| A_n contains a different 3-cycle (pqr), we could go through the same proof using p, q, r everywhere in place of 1, 2, 3. Then, assuming (123)\in N, Stewart proves that any 3-cycle (abk) is in N. The key calculation is the second display on that page (corrected as noted in the homework due right after the first midterm) and the preceding sentence. ---------------------------------------------------------------------- You ask about the assumption at the top of p.148 that "without loss of generality" N contains an element of the form (123)(456)y where y fixes 1, 2, 3, 4, 5, 6. Stewart is considering there the case where N contains an element whose cycle-decomposition contains two 3-cycles; in other words, an element of the form (abc)(def)y where y fixes a, b, c, d, e, f and these six elements are distinct. Now in the set of elements which S_n permutes, any set of six distinct elements is like any other -- there is nothing in our assumption that singles one out. So if we can prove the result in the case where our a, b, c, d, e, f are 1, 2, 3, 4, 5, 6, then we can prove it for any a, b, c, d, e, f by just writing a, b, c, d, e, f for 1, 2, 3, 4, 5, 6 respectively in our proof. ---------------------------------------------------------------------- You ask about an algorithm for finding the conjugacy classes (defined on p.149) in a group G. If G is given by its multiplication table, this is easy: It is not hard to check that two elements of a group are conjugate if and only if they can be written as xy and yx respectively for some x and y in the group. So to find all the conjugates of an element a, one takes the multiplication table, looks for the occurrence of a in each column, reflects the locations of these occurrences about the diagonal of the multiplication table, and the elements in the reflected positions are the conjugates of a. (Of course, if one has a computer that can multiply elements of G, this is not much easier than using the original definition, and getting the computer to list the elements b a b^{-1} for all b\in G. Either way, it's O(n) steps. I assume that it is clear that to get a list of the conjugacy classes, one computes the class of one element, takes all members of that class out of consideration, computes the class of the first element remaining, and so on until one has eliminated all the elements.)
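If you want to see that procedure on a machine, here is a minimal Python sketch of it, using the xy/yx characterization; the choice of S_3, encoded as permutation tuples, is mine, just for illustration:

  from itertools import permutations

  # The elements of S_3 as tuples, with composition as the multiplication.
  elems = list(permutations(range(3)))
  def compose(p, q):
      return tuple(p[q[i]] for i in range(3))

  # The multiplication table: table[(x, y)] = xy.
  table = {(x, y): compose(x, y) for x in elems for y in elems}

  # Wherever xy = a occurs in the table, the reflected entry yx is a
  # conjugate of a (yx = x^{-1}(xy)x), and every conjugate arises this way.
  def conjugates(a):
      return {table[(y, x)] for (x, y), w in table.items() if w == a}

  # List the classes: compute one class, set it aside, and repeat.
  remaining, classes = set(elems), []
  while remaining:
      c = conjugates(min(remaining))
      classes.append(c)
      remaining -= c
  print(sorted(len(c) for c in classes))   # -> [1, 2, 3] for S_3

(The three classes of S_3 -- the identity, the three transpositions, and the two 3-cycles -- come out with sizes 1, 3 and 2, as expected.)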
But that is just one of the possible ways a group could be described. As another example, if one takes the group of all invertible n x n matrices over an algebraically closed field, the conjugacy classes correspond to the distinct Jordan canonical forms (with no zeroes on the diagonal, to make these matrices belong to the group). That is really what Jordan canonical form is about, and it obviously took some insight into the structure of the group in question to find it -- not some mechanical procedure! If one's base field is not algebraically closed, there is something called the "rational canonical form", which is messier; it is often described in Math 110 texts but not covered in the course. ---------------------------------------------------------------------- I hope what I said in class clarified your final question about the proof of Cauchy's Theorem on p.150. In the relation $|C_j| = |G|/|C_G(x)|$, since $p$ divides the numerator of the fraction but not the left-hand side of the equation, it must divide the denominator of the fraction. ---------------------------------------------------------------------- You ask for an example where the analog of Cauchy's Theorem for composite divisors is not true, other than A_5 noted in Exercise 14.6, p.151. Well, there are two ways one could try to generalize Cauchy's Theorem to composite divisors. In his comment at the bottom of p.150, Stewart tacitly translates Cauchy's Theorem as saying that G has a _subgroup_ of order p, making A_5 and n=15 an example of a group G and a divisor n of |G| such that G has no subgroup of order n. A simpler example of this is G = A_4, n = 6. But if one takes the literal statement of Cauchy's Theorem, then for its generalization to fail one merely wants a group G and a divisor n of |G| such that G has no element of order n. For this, any finite non-cyclic group will do: take n = |G|. ---------------------------------------------------------------------- You ask about the difference between Definition 15.1 (p.153) and the concept of solvability by Ruffini radicals (p.96). There's not very much. I guess Stewart used the latter term if an extension that was given turned out to be a radical extension in the sense of Definition 15.1, while he uses Definition 15.1 even when the radical extension is not the extension we are interested in for itself, but a field containing it. But that's not a very mathematical distinction. I'll point this out to him. And on Monday or Wednesday I'll give an example where the splitting field of a polynomial is contained in a radical extension but is not itself radical. (I really already did give one: any irreducible cubic with all three roots real. But I'll give a concrete example where we can see the structure.) ---------------------------------------------------------------------- You ask about the definition of "radical degree" at the bottom of p.153. You're right that this definition suggests that if n_j is a radical degree for alpha_j, then any multiple of it will also be one, and that from this point of view a more useful concept might be the least n_j with the indicated property. But I think that the real idea here is that we are specifying a sequence of elements and integers, (alpha_1, n_1, ... , alpha_m, n_m) satisfying the condition of the preceding display, and that we then manipulate such sequences, doing such things as replacing a given pair alpha_j, n_j by a sequence of pairs that make each integer in question a prime (as in Prop.8.9), etc.
So the "radical degree of alpha_j" just means whatever integer we have after alpha_j in the sequence we are working with at the moment. Note, as an example, that if we are given two radical extensions of K, say K(alpha_1, ... , alpha_m) and K(beta_1, ... , beta_n), and we want to show as I did in class that their composite K(alpha_1, ... , alpha_m, beta_1, ... , beta_n) is a radical extension, it is most convenient to use the exponents that we already associated with the beta's, even though, after adjoining the alphas, smaller exponents may work. ---------------------------------------------------------------------- You ask whether Lemma 15.5, p.155, remains true if p is replaced by an integer n that is not a prime. Yes, it does; but then it takes more work to show that the group of nth roots of unity is cyclic, as needed by the proof. This is not so hard when we are working with subfields of the complex numbers, since e^{2 pi i / n} will be a generator; but harder in an abstract field, such as Z/qZ for q an arbitrary prime. We'll prove this in Theorem 20.8. ---------------------------------------------------------------------- You ask about the statement on p.156, 6th line of proof of Lemma 15.7, that if alpha_1\in K, so that L=K(alpha_2,...,alpha_n), the result of the lemma holds by induction. Notice that the second paragraph of the proof began "We prove the result by induction on n"; i.e., on the number of elements that are adjoined (each having a prime power in the field generated by those that precede) to get L from K. So in this induction we assume the result is true of all extensions L:K that can be gotten in this way using < n such elements. Now if alpha_1 \in K, then L=K(alpha_2, ...,alpha_n) so L can be gotten from K using < n such elements, hence by that inductive assumption, the conclusion is true. ---------------------------------------------------------------------- You ask about the statement on p.156, 6th from last line of text, that if we set epsilon = alpha_1/beta, then epsilon^p = 1. By assumption, alpha_1 has pth power in K, and beta is a zero of the minimal polynomial of alpha_1. Now since t^p - (alpha_1)^p is a polynomial with coefficients in K (by assumption on alpha_1) which is satisfied by alpha_1, it is a multiple of the minimal polynomial of alpha_1, which by assumption also has beta as a zero. So beta^p - (alpha_1)^p = 0. Hence epsilon^p = (alpha_1/beta)^p = (alpha_1)^p / beta^p = 1. ---------------------------------------------------------------------- You ask about the notation $M(\alpha_1)(\alpha_2, \ldots, \alpha_n)$ in the second display on p.157. It means the same as $M(\alpha_1, \alpha_2, \ldots, \alpha_n)$. The author is writing it as he does to emphasize that we may look at $L$ as gotten by taking the field $M(\alpha_1)$ and adjoining $\alpha_2, \ldots, \alpha_n$ to it. ---------------------------------------------------------------------- You ask whether the proof of Theorem 15.3 (pp.155-157) carries over in a straightforward way to fields not contained in C. Yes. A difference, which has been mentioned in class several times, is that working within the complexes, "normal closure" means the least subfield of the complexes containing our given field and normal over our base field, while for general fields, it is something that one constructs abstractly by adjoining enough roots to an appropriate polynomial so that it splits. The one other difference is where Stewart refers to properties of t^p - 1 in Lemma 15.5, and the proof of Lemma 15.7. 
In these, t^p - 1 splits for a nontrivial reason, which is still valid in abstract fields whenever p is not the characteristic; while if p is the characteristic, then t^p - 1 splits for a trivial reason: because it equals (t-1)^p. ---------------------------------------------------------------------- You ask why, in the first line of the proof of Theorem 15.3 on p.157, M:K_0 will be radical if M:K is. Note that K_0 as defined here contains K and is contained in M. Hence if M is generated over K by a "radical sequence" (last line of p.153) then it will be generated over K_0 by the same sequence, and from the definition of radical sequence, a sequence that is a radical sequence over K will still be one over the possibly larger field K_0. ---------------------------------------------------------------------- You ask in connection with Definition 15.8 (p.158) whether normality is of interest outside the fact that it characterizes splitting fields; and in particular, whether it is of interest for infinite extensions. Yes; in fact I would say that the reason for the emphasis on splitting fields is that they have the property of normality. If normality were not so powerful, then the natural extension of a field K to associate to an irreducible polynomial f would be K[t]/<f>, the field gotten by adjoining a single root of f. Note that the Fundamental Theorem of Galois Theory is true for finite normal separable extensions, but not for other finite extensions. Infinite normal extensions are indeed studied; however to state and prove a version of the Fundamental Theorem of Galois Theory for these requires topological as well as algebraic concepts, so it is not done in courses like Math 114 or even 250A (though I develop it in Math 250B, which gives the instructor a lot of leeway on what to include). ---------------------------------------------------------------------- You ask how Theorem 15.9 (p.158) is a "restatement" of Theorem 15.3. This assertion of Stewart's is a bit sloppy. Theorem 15.9 is a restatement of the case of Theorem 15.3 where L:K is normal, and hence is a splitting field of some polynomial. ---------------------------------------------------------------------- You ask whether Stewart's assertion "(this also follows by irreducibility)" near the bottom of p.159 is based on separability of irreducible polynomials over C. Right! ---------------------------------------------------------------------- You ask about Stewart's statement on p.163 that "... the machinery needed to prove the existence of an algebraic closure is powerful enough to make the concept of an algebraic closure irrelevant anyway..." He isn't referring to what he does in this chapter, but in Chapter 17, for which Chapter 16 is preparation. The gist of what he means is that the process of constructing an extension of an arbitrary field K in which all polynomials have roots (an algebraic closure) begins with the construction of extensions in which we get roots of an arbitrary finite set of polynomials; but this is all that Galois theory needs, so for the purposes of Galois theory we may as well stop there. (What he doesn't say is that to go from that construction to the construction of an algebraic closure of K also requires considerable set-theoretic machinery that we don't need for Galois theory; so it is really quite fortunate that we can do without it.)
---------------------------------------------------------------------- You asked about integers modulo n when n is not positive, as in the uncorrected version of line 4, p.165. Whatever the value of n, it generates an ideal nZ, so one can look at the quotient ring Z/nZ. In particular, when n is negative, say -m where m is positive, this is what is usually written Z/mZ or Z_m, while when n = 0 it is isomorphic to Z. When m = 1, what we get fails to satisfy the condition "1 \neq 0" in (M3) in Stewart's definition of a ring. Authors differ as to whether to impose that condition. Leaving out that condition allows just one additional structure as a ring: the trivial ring, i.e., the structure with only a single element, which is both the 1 and the 0. What I consider the best choice is to allow this as a ring, but to exclude it from the definitions of field and integral domain by imposing the condition 1\neq 0 there. This choice seems implicit in what Stewart does, since on p.167 he constructs R/I for any ideal I, even though I = R would make R/I the trivial ring, while on p.166 line 4 he says Z_1 is not a field. However, in order to minimize the meddling I do with his definitions, I have not put such a change in definitions into the errata. ---------------------------------------------------------------------- You ask how one proves part 4 of Example 16.4, p.165, i.e., that in Q[t] not all elements have inverses. If a polynomial p(t) has an inverse q(t), then p(t)q(t) = 1, which has degree 0. Looking at the formula on p.20, first display, for the degree of a product (the degree of a product is the sum of the degrees of the factors), we see that this is impossible if p is taken to have degree > 0. ---------------------------------------------------------------------- You ask about Stewart's statement on p.166, item 10 near top, that Z_1 is not a field because it doesn't satisfy the condition 1 not-= 0 of (M3). Well, I think it was a mistake for him to include that condition in his definition of a ring, because it makes the statement "R/I is a ring for every ideal I" false for the ideal I=R; but I think it should be kept in the definitions of integral domain and field, because these are classes of rings with special important properties, and some of those properties are lost if we allow 1=0. (Note that if 1=0 in R, then multiplying by any x\in R, we get x = 0, so a ring with 1=0 has just one element, the zero element.) So we will continue to regard Z_1 as _not_ a field. Fortunately, the details of which definition the condition 1 not-= 0 is put in -- the definition of ring or the definitions of integral domain and field -- won't make for problems once we get into the meat of the course, because we will be dealing mainly with fields and polynomial rings over fields, so 1 not-= 0 will indeed hold. ---------------------------------------------------------------------- You ask whether in Case 1 on p.170, to prove that the ring of elements n* is isomorphic to Z_p, one should show that n* = m* <=> [n] = [m] in Z_p. That is a key step. One also has to show that they add and multiply like members of Z_p. The quickest way to do all this together is to notice that m |-> m* is a ring homomorphism, so its kernel is an ideal of Z, and recall from Math 113 that every ideal I of Z is principal, and that if (as in this case) it is not {0}, then it is generated by its smallest positive element. In this case, that is p, so the kernel of m |-> m* is pZ, so its image is isomorphic to Z/pZ.
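Since the argument is abstract, it may help to see it played out in a concrete field whose elements are not literally integers. Here is a minimal Python sketch (the encoding, and the choice of the 9-element field Z_3[t]/<t^2 + 1>, are mine; t^2 + 1 is irreducible over Z_3):

  q = 3
  # Elements of Z_3[t]/<t^2 + 1>, stored as pairs (a, b) standing for a + b t.
  def add(x, y):
      return ((x[0] + y[0]) % q, (x[1] + y[1]) % q)
  def mul(x, y):
      a, b = x; c, d = y
      return ((a*c - b*d) % q, (a*d + b*c) % q)   # using t^2 = -1

  def star(m):
      # m* = 1 + 1 + ... + 1 (m times), with the additive inverse for negative m.
      s = (0, 0)
      for _ in range(abs(m)):
          s = add(s, (1, 0))
      return s if m >= 0 else ((-s[0]) % q, (-s[1]) % q)

  # m |-> m* is a ring homomorphism whose kernel is 3Z, so its image is a copy of Z_3.
  assert all(star(m + n) == add(star(m), star(n)) for m in range(-9, 9) for n in range(-9, 9))
  assert all(star(m * n) == mul(star(m), star(n)) for m in range(-9, 9) for n in range(-9, 9))
  assert all((star(m) == (0, 0)) == (m % 3 == 0) for m in range(-30, 30))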
---------------------------------------------------------------------- You ask why, in defining on p.172 the set S of ordered pairs (r,s) from which he will construct the field of fractions of R, Stewart requires s not-= 0. The idea behind the construction is that (r,s) is the element that will ultimately represent the fraction r/s, and we don't expect that fraction to make sense if s = 0. As for what would go wrong in the proof if we allowed s = 0, it is the verification that ~ is an equivalence relation. For every pair (r,s) we see from the equation defining the relation that (r,s) ~ (0,0), hence for any two pairs (r,s) and (r',s') we have (r,s) ~ (0,0) ~ (r',s'), but in general (r,s) ~ (r',s') will not be true. You also ask about the need for R to be an integral domain in verifying that the maps Stewart defines are operations (point 2 on his "checklist"). That condition is needed because otherwise in a product such as [r,s][t,u] = [rt,su], the term su could be zero, which as just noted, we can't allow. ---------------------------------------------------------------------- You ask how the construction of the field of fractions of an integral domain (Theorem 16.16, p.172) shows that every integral domain is isomorphic to a subring of some field. The definition of "field of fractions" says that it has a subring R' isomorphic to R, which is the condition you ask about. (In the proof of the theorem, this comes out as point 4 at the bottom of p.172: the map described is a monomorphism, and any monomorphism gives an isomorphism between its domain and its image.) ---------------------------------------------------------------------- You ask in connection with the construction of fields of fractions (p.172) whether every field K can be expressed as the field of fractions of a proper subring A. This is a subtle question. If K is finite, or even an infinite union of finite subfields, then one cannot. In other cases one can, but to prove this in general requires tools beyond this course. For instance, there is no obvious way of finding a proper subring A whose field of fractions is the field of real numbers. One can start by saying "We want A \intersect Q to be Z", and try to construct such an A by finding a set of elements which when adjoined to Q gives R, and then adjoining them to Z instead; but we have to be very careful in how we choose these elements so that we don't get all of R in the process. It requires using something called the Axiom of Choice (discussed in Math 135) together with a lot of careful algebra. ---------------------------------------------------------------------- You ask "Do we think of a set of rational expressions as making up a field?" Yes! Stewart is a little vague in Chapter 4 about what "rational expressions" are, but in Chapter 16 he is quite precise. On p.175, at the beginning of the long final paragraph, he explicitly says what he will mean by the phrase. ---------------------------------------------------------------------- You ask about the identification of K with iota(K) in the paragraph above Theorem 17.3 on p.179, saying "Is this just because iota is an inclusion, ... ?" It isn't literally an inclusion. To "identify K with iota(K)" means to pretend it is an inclusion, so as to make the picture easier to understand; in essence, to use the same symbols for elements of K and their images in K[t]/I and think of them as the same. But for each c\in K, iota(c) is the equivalence class I + c.
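If you would like to see the identification in computation, here is a minimal sympy sketch (the choice K = Q and I = <t^2 - 2> is mine, just for illustration), in which each coset is named by its remainder on division:

  from sympy import symbols, Poly

  t = symbols('t')
  m = Poly(t**2 - 2, t, domain='QQ')               # I = <t^2 - 2> in Q[t]
  cls = lambda f: Poly(f, t, domain='QQ').rem(m)   # the coset I + f, named by its remainder
  iota = lambda c: cls(c)                          # iota(c) = the equivalence class I + c

  # iota respects sums and products, which is what licenses identifying c with iota(c):
  assert cls(iota(2).as_expr() + iota(3).as_expr()) == iota(5)
  assert cls(iota(2).as_expr() * iota(3).as_expr()) == iota(6)
  # and the coset I + t is a square root of 2 in K[t]/I:
  assert cls(t**2) == iota(2)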
---------------------------------------------------------------------- Concerning the definition of a splitting field, p.181, you write that you haven't been able to think of an example of a splitting field of a polynomial that is not in the complex numbers or Z_p. What kind of an answer to give you depends on exactly what you want; in particular, whether it is the base field that you don't want to be a subfield of C or Z_p, or the extension field, and whether you literally mean "in", or mean "related to". Every field either has characteristic 0 or characteristic a prime, and fields with characteristic 0 look similar to subfields of the complex numbers, while fields of characteristic p all have Z_p as prime field; so in that sense, you can't get away from those two cases. However, an example of a characteristic 0 splitting field where the fields themselves do not lie in the complex numbers is the extension L:K studied in sections 8.7 and 8.8 (defined on p.95, last display), where both fields contain the field of complex numbers, but don't lie inside it. Note that L is generated over K by t_1,...,t_n, which are the roots of the "general polynomial", so it is the splitting field of that polynomial. One can do exactly the same construction starting with any field, including Z_p, in place of C. An example which is similar in that it uses rational function fields is the "t^p - x^p" one that I gave in class, close to Example 17.16 in tomorrow's reading. For an example where the base field is Z_p, but the extension field is not of the form Z_p, one can take any irreducible polynomial over Z_p (e.g., t^2 + t + 1 when p = 2, or t^2 + 1 when p = 3) and construct its splitting field by the "K[t]/<f>" method. (Exercise 16.6, which was not put in the homework, is the first of these examples.) Finally, if you're just concerned with the extension field not literally being inside the complex numbers, even though it may be isomorphic to a subfield of the complex numbers, the example I gave in class of a square root of 3 in the 2x2 matrix ring over Q, and the field that it generates, which is isomorphic to Q(sqrt 3), would do. ---------------------------------------------------------------------- You ask about the fact that Stewart seems to define the concept of "adjoining" elements to a field twice within a few pages: in the third line of section 17.2 on p.178, and in Definition 17.7 on p.181. The two situations are not the same. In the former, we are given elements of an extension field L of K, and "adjoining" them gives a field K(X) containing K and X and contained in L. In the latter, we are not given a field containing K; rather, we build one, either as the field of fractions K(t) of the polynomial ring over K, or as a factor ring K[t]/<m>. But the latter case "retrospectively" takes the form of the former, in that the extension field K(t) or K[t]/<m> turns out to be generated over K by the new element, t or t + <m>; so it is reasonable to use the common term "adjoin" for both constructions. ---------------------------------------------------------------------- You ask about the statement in the bottom paragraph of p.181 that splitting fields are unique up to isomorphism. The justification is in the next sentence: Stewart observes that the proof is the same as that of Theorem 9.6. So the reader is expected to go back to that theorem and read through the proof again, making sure that every step is valid for a general field, not just a subfield of the complex numbers.
I am not enthusiastic about Stewart's leaving this verification to the reader in an undergraduate text; but it is logically valid. ---------------------------------------------------------------------- You ask for examples of distinct splitting fields of the same polynomial (p.181, bottom), and why splitting fields in C are unique. Three examples of splitting fields over Q of t^2 - 2 are (i) the subfield of C generated by sqrt 2, (ii) the field Q[t]/<t^2 - 2>, and (iii) the ring of 2 x 2 matrices over Q generated by the scalar matrices and the matrix with 2 in the upper right corner, 1 in the lower left, and 0 in the other two corners. As to why splitting fields are unique in C -- if K is a field, f an irreducible polynomial over K, and L an extension of K in which f splits, then f has a unique splitting field over K in L, namely the subfield of L generated by all the zeroes of f in L. So it's nothing special about C; it's just the fact that we are restricting the zeroes to all lie in a specified field, rather than allowing them to be taken in different fields. ---------------------------------------------------------------------- In your question about Lemma 17.4, p.183, you say that if the characteristic of k is p > 0, then k is isomorphic to Z_p. Not so! The elements of k obtained by adding 1 to itself repeatedly form a subfield isomorphic to Z_p, called the prime subfield of k (Def.16.8, p.169), but this is not in general all of k. We have seen examples where it is not. The easiest to describe are the fields of rational functions in one or more indeterminates over Z_p, i.e., Z_p(t), Z_p(t_1, t_2), etc. Others can be obtained by adjoining to Z_p zeroes of irreducible polynomials, as in the present homework assignment. ---------------------------------------------------------------------- Regarding the top line on p.185 you ask why one has unique factorization. By Theorem 3.16, p.38. As with so many other results, Stewart proves this for polynomials over subfields of C, but the proof he gives works over any field, so he uses it in this later section without that restriction. ---------------------------------------------------------------------- You ask about the last two sentences of the proof of Lemma 17.21, p.186. We have seen that alpha has minimal polynomial g(t^p), while alpha^p is a zero of g(t). So if we let n be the degree of g(t), the minimal polynomial of alpha over K has degree pn, while the minimal polynomial of alpha^p over K has degree _< n, which is smaller than pn. So [K(alpha^p):K] < [K(alpha):K], so K(alpha) and K(alpha^p) can't be the same field. ---------------------------------------------------------------------- You ask about the "it suffices" statement at the top of p.187. If we prove the set of elements separable over K is closed under the operations listed, then it will be a subfield of L. It clearly contains K, and by assumption, it contains a set S of elements that generate L over K, so it will contain the field K(S) = L, meaning that L is separable over K, which is what we are trying to prove. ---------------------------------------------------------------------- You ask whether Proposition 17.18, p.185, can be proved algebraically, without the use of formal derivatives. Although the derivative of calculus is an analytic/topological notion, the formal derivative of a polynomial over an arbitrary field is an algebraic one -- it is defined and the properties we use are proved without using limits, even though it is inspired by the analytic concept.
And it is a very powerful algebraic tool! My understanding is that no way is known to get the corresponding results without it. Incidentally, though the factorization properties of integers and of polynomials over a field are very similar, the formal derivative provides a way of telling whether a polynomial has a multiple factor, but no analogous method is known that works for integers. (So a polynomial of degree 100 can quickly be tested for multiple factors, but a 100-digit integer cannot.) ---------------------------------------------------------------------- You ask whether the polynomial t^6 - t^3 + 1 over Z_3 (p.190, last line) is separable. It factors as (t^2 - t + 1)^3 = (t+1)^6. By Stewart's Definition 17.19 it is separable, since all the irreducible factors are separable. By everyone else's definition it is inseparable, because it has multiple roots. That's why I left out that question. ---------------------------------------------------------------------- You ask why the second line of p.193, starting with "Then", follows from the first. I guess you mean the second sentence, since the second line contains no assertions, just the definition M = K(t,u). The sentence asserts that t and u are independent transcendentals over K and then on the next few lines makes a finiteness assertion. Here I will guess that it is the former that you are asking about. This follows from part (ii) of the lemma I put on the board on Monday, which I noted was used by Stewart without explicit statement, and both parts of which are implicitly proved in my proof of the Steinitz Exchange Lemma. Statement (i) is that if a family of elements are independent transcendentals over K, and we bring in one more element that is transcendental over the field generated by those elements, then the resulting enlarged family is again a family of independent transcendental elements over K. In this case, the "given family" has just the one element t, and the additional element is u. ----------------------------------------------------------------------

> Bottom of 193 to top of 194
>
> I'm a bit confused as to the point of Stewart's comments "now
> interpreted as elements of K(\alpha_{1}, . . . \alpha_{n})" and the
> point that we are evaluating at t_{i} = \alpha_{i}. What's the point
> of these comments? ...

I'm not really sure. Probably the fact that earlier, the symmetric polynomials we considered were elements of k[t_1,...,t_n] lying within the field L of section 8.7; but here we are looking at the result of substituting the alpha's into those polynomials, rather than having the polynomials themselves as elements of our field. (Though in the first half of section 18.3, he will again have the polynomials as elements of his field!) ---------------------------------------------------------------------- You ask what is involved in generalizing Exercise 8.4 to an arbitrary field, as called for in the proof of Theorem 18.8, p.194. Nothing -- the same proof applies! Stewart says "generalized" simply because the exercise was originally stated in a chapter where all fields were assumed to be subfields of the complex numbers, and he now wants to use it without that assumption. ---------------------------------------------------------------------- You ask why, on p.195, paragraph after 1st display, the fixed field of S_n contains the symmetric polynomials in the t_i. That is the definition of "symmetric polynomial" -- a polynomial that is unchanged by all permutations of the indeterminates, i.e., by the action of S_n.
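For anyone who wants a machine check of that invariance, here is a minimal sympy sketch (the choice of the elementary symmetric polynomial s_2 in three indeterminates is mine, just for illustration):

  from itertools import permutations
  from sympy import symbols, expand

  ts = symbols('t1 t2 t3')
  s2 = ts[0]*ts[1] + ts[0]*ts[2] + ts[1]*ts[2]   # the elementary symmetric polynomial s_2

  # every permutation of the indeterminates (i.e., the action of S_3) fixes s_2
  for sigma in permutations(ts):
      image = s2.subs(dict(zip(ts, sigma)), simultaneous=True)
      assert expand(image - s2) == 0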
---------------------------------------------------------------------- You ask why $f(t_n)=0$ holds in the fifth line of the proof of Lemma 18.9, p.195. I hope that my version of that proof in class made this clear: f(t) is what one gets when one expands the product (t-t_1)...(t-t_n), hence it has all of t_1,...,t_n as zeroes. (Stewart goes through this in section 18.2.) ---------------------------------------------------------------------- You ask how Stewart gets the inequality [K(s_1,...,s_n,t_n) : K(s_1,...,s_n)] _< n on p.195. If we call K(s_1,...,s_n) "M", this says [M(t_n):M] _< n, which means that t_n satisfies a polynomial of degree _< n over M. And that is what Stewart has just proved, using the polynomial f. ---------------------------------------------------------------------- You ask how one would come up with the relation s_j = t_n s'_{j-1} + s'_j given on page 195, 3rd to last display. Well, s_j is a symmetric polynomial in t_1,...,t_n, but one wants to see what form it takes when we separate t_n from the other t's. s_j will still be symmetric in those other t's, and its degree in t_n is 1, so it can be written as [?]t_n + [?], where the two "[?]"s must be symmetric polynomials in t_1,...,t_{n-1}, since s_j is symmetric in those elements. So we look at what form the coefficient of t_n in s_j has, and what form the sum of the monomials not involving t_n has; we find that the former equals s'_{j-1}, and the latter, s'_j. ---------------------------------------------------------------------- You ask what, in the proof of Lemma 18.11, p.196, ensures that $(t_1,\ldots,t_n)$ are independent transcendental elements. The words "With the above notation" in the statement of the Lemma, where "the above notation" (same phrase used in Lemma 18.9) means the assumptions of the first paragraph of the section. ---------------------------------------------------------------------- You ask about Stewart's putting ``over'' in quotation marks in the statement of Definition 18.12, p.196. Usually, a polynomial over a field means a polynomial with coefficients in that field. But that is a case of the more general use of "over" to mean "constructed by building something on top of". In this case, the usage fits the wider sense but not the narrow sense; so Stewart uses quotation marks to signal that there is a deviation from what one would expect the term to mean based on the usual usage for polynomials. ---------------------------------------------------------------------- You ask why the extension \Sigma : K(s_1,...s_n) on the 4th line of p.197 is separable. This is easy to see using the development I gave in class: by the uniqueness of splitting fields, it is isomorphic to the extension discussed from the beginning of the section to the middle of p.196, which is separable because the zeroes of the defining polynomial are the independent (hence, distinct) indeterminates t_1,...,t_n. Using Stewart's development, I guess one has to argue the point separately: If it were not separable, then t_1,...,t_n would not be distinct, hence the transcendence degree of K(t_1,...,t_n) would be < n, from which one could get a contradiction using Theorem 18.13. ---------------------------------------------------------------------- You ask in connection with the last sentence of p.197 why Hilbert's Theorem 90 wasn't called Hilbert's Theorem 93. It wasn't named after the year! There were a large number of numbered results in his report, and this was the one he numbered "Theorem 90".
Evidently it stood out as important, but there wasn't any obvious descriptive name to apply to it, so people started referring to it by the way it was numbered in his report. (Where Stewart says "its appearance" he should have said something like "its designation".) Whether it was the 90th theorem, or whether he used a common numbering system for Definitions, Theorems, Lemmas, Remarks etc., so that it was merely 90th among this larger collection, I don't know. ---------------------------------------------------------------------- You ask why, as stated on the line after the last display on p.199, K(alpha) is the splitting field of t^p - a. Each of the elements tau^j(alpha) = epsilon^-j alpha will be a zero of that polynomial because they are images of one such zero under members of the Galois group. Since there are p distinct such elements, they form all the zeros of that polynomial. They all lie in K(alpha) because of the assumption epsilon\in K, and they generate K(alpha) over K since alpha is one of them; so K(alpha) is the field generated by all the zeros of t^p - a, i.e., the splitting field of that polynomial. (When you saw this statement in Stewart, you should have asked yourself "What has to be true for the assertion to hold?", and then you would hopefully have seen that two things were needed: That p distinct zeros of the polynomial be contained in K(alpha), and that they generate it. You could then have asked yourself whether you could verify either property; and if you got stuck, you could hopefully have at least made your question narrower, e.g., "How do we know that all the zeros of t^p - a lie in K(alpha)?") ----------------------------------------------------------------------

> Page 200, Paragraph above the end of proof box.
>
> How do we know that \phi^{-1}(H) is going to be normal in \Gamma(N:M)?
> I suppose this boils down to the question, why does [P:M] = p imply
> that P is normal, using Stewart's notation.

[P:M] = p doesn't imply that P is normal; look again at Stewart's proof and you'll see that he gets those two facts by applying two different parts of Theorem 17.23 to the two results of the preceding sentence. It is those results that you need to see the justification of. I hope what I said in class provided it, namely that in the situation in question, phi is an isomorphism, so under phi, Gamma(N:M) "looks just like" G; in particular, starting with the normal subgroup H of index p in G, we get a normal subgroup of index p in Gamma(N:M) by applying phi^-1. ---------------------------------------------------------------------- You ask why, as stated in the 2nd sentence of the 2nd paragraph of the proof of Lemma 19.2, p.213, "to prove the lemma it is sufficient to show that given (0,x) and (0,y) we can construct (0,x+y), (0,x-y), (0,xy) and (0,x/y) ...". Well, if we can do that, then the set of values of x such that we can construct (0,x) by the operations of transferring the coordinates of points of P to the y-axis, and then applying the operations listed above, will be a field, namely the field generated by the coordinates of all points of P. As Stewart notes in the preceding sentence, given two points (0,x) and (0,y) on the y-axis, we can construct (x,y). So for every x and y in the field generated by the coordinates of points of P, we can construct (x,y), as claimed in the lemma. ---------------------------------------------------------------------- You ask about the geometric constructions for multiplication and division (p.213).
Since time is short, and no one else asked about these, I probably won't get to them in class, so you should try to work through the constructions in the book, and e-mail me or come to office hours if you have questions. What you write suggests that you are thinking of (x,0) |-> (1/x, 0) and (x,0), (y,0) |-> (xy, 0) as the basic constructions. In the abstract, these are the natural ones to start with, but the method that Stewart shows instead begins with the operation (x,0), (y,0) |-> (x/y, 0), because it happens to be very geometrically elegant. Once one has this operation, one can apply it with x = 1 to get (y,0) |-> (1/y,0), and then get multiplication via division: (x,0), (1/y,0) |-> (x/(1/y), 0) = (xy,0). So everything comes down to how to get (x,0), (y,0) |-> (x/y, 0). This is illustrated in Fig.19.7, bottom of p.213. Can you follow it? ---------------------------------------------------------------------- You ask about the phrase "some suitable set of points" in the statement of Lemma 19.3, p.214. From Lemma 19.2 we know that as long as a set P of points contains (0,0) and (1,0), we can get from P any other point whose coordinates lie in the field generated by the coordinates of the members of P. This means that there is a lot of freedom in what point-set we start with. If alpha is a zero of t^2 + pt + q as in the proof, then any set P which contains (0,0) and (1,0), and such that the field generated by the coordinates includes p and q, will do. ---------------------------------------------------------------------- Regarding the proof of Lemma 19.3 on p.214 you ask

> ... if we can construct (0,sqrt(k)) how can we construct
> something in K(alpha)?

What Stewart says is, "... if we can construct (0,sqrt(k)) for any positive k\in K ...". Assuming alpha has the form shown in the display in that proof, the k that one needs to take the square root of is p^2 - 4q. Using the square root of this element, and the element p of K, and the operations of Lemma 19.2, one can then get alpha. ---------------------------------------------------------------------- You ask about the Intersecting Chords Theorem, mentioned on p.214, middle of page. I mentioned it in my Friday preview of this reading: It says that if you draw two chords in a circle, AB and CD, which intersect in a point X inside the circle, then AX . XB = CX . XD (products of lengths). To see its use in Figure 19.8, let AB be the line from (-1,0) to (k,0), and CD be the vertical line through the origin, with C and D the points above and below the axis where this line crosses the circle. ---------------------------------------------------------------------- You ask whether the converse to Theorem 19.4 (p.214) is true. It is! This was shown in the proof of Theorem 7.4, though it wasn't put into the statement of that Theorem. ---------------------------------------------------------------------- You ask about the significance of the table of powers of 3 mod 17 on p.220. I previewed this development last time precisely because I felt it would be incomprehensible without motivation. What I showed was that if we write epsilon for a primitive 17th root of 1, each automorphism of Q(epsilon):Q took epsilon to epsilon^m for some m unique modulo 17, and that for every nonzero m\in Z_17, there was such an automorphism.
I showed that composition of these automorphisms corresponds to multiplication of nonzero elements of the ring Z_17, and indicated that experimentation showed that the automorphism tau taking epsilon to epsilon^3 generated the whole group. This experimentation consists of computing the powers of that automorphism, which in view of the composition law described above, corresponds to computing the powers of 3 mod 17, which Stewart does. He then puts together elements x_1, x_2 which will each be invariant under tau^2, and elements y_1,...,y_4 which will each be invariant under tau^4, using that table. ---------------------------------------------------------------------- You ask how a geometric construction follows from the expression for cos theta (i.e., cos 2 pi/17) in terms of square roots on p.222. Lemmas 19.2 and 19.3 show how a line segment of the length indicated by that expression can be constructed. As I pointed out in class, from the cosine of an angle one can construct the angle, and from an angle of 2 pi / n one can construct a regular n-gon. ---------------------------------------------------------------------- You ask how the notation e(G) (p.229) would be used in the case of an infinite group G. An infinite group can have finite or infinite exponent. For instance, if F is a field of characteristic p, then the additive group of the polynomial ring F[t] is infinite, but all elements of that group have finite order, and the l.c.m. of those orders is p; so e(F[t]) = p. On the other hand, if a group either has elements of infinite order, or elements of infinitely many different finite orders, then one has e(G) = infinity. ---------------------------------------------------------------------- You ask about Stewart's calculation Df = -1 in the proof of Theorem 20.2, p.228. It is correct. As you say, one has Df = q t^{q-1} - 1; but the characteristic is p and q is a power of p, hence a multiple of p, so as a coefficient, it equals 0. ---------------------------------------------------------------------- You ask whether Stewart is implicitly using Theorem 14.15 in the proof of Lemma 20.6, p.229. No. Theorem 14.15 only says that if a prime divides the order of a group there will be an element of that prime order; it doesn't give us any information about powers of primes. (For instance, S_4 has order divisible by 2^3, but it has no element of that order.) What Stewart is calling on here is the _definition_ of e(G), as an l.c.m., and the properties of l.c.m.s: if the l.c.m. of a set of integers is divisible by p^n, then one of those integers must be divisible by p^n. ---------------------------------------------------------------------- Regarding Example 20.10.2 on p.230 you ask

> How do we know GF(25) can be constructed as a splitting field for
> t^2-2 over Z_5 beforehand?

We know that any extension of Z_5 having degree 2 will be a field of 25 elements, hence by the uniqueness Stewart has proved, will be isomorphic to GF(25). It takes a few seconds of calculation to see that 2 is not a square in Z_5; from that we can see that the splitting field of t^2-2 will be an extension of degree 2.

> Do we generally construct GF(p^n) as a splitting field for t^n-x
> over Z_p, for any x\in Z_p?

We can obtain it by whatever method we find convenient, given our knowledge of field theory.

> Then what about GF(p^p)?

Obviously, in this case we want a splitting field of an irreducible polynomial over Z_p of degree p. You saw in homework one very convenient sort of polynomial of that form.
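(The "few seconds of calculation" can, of course, also be left to a machine; here is a minimal Python sketch, whose encoding is my own:

  p = 5
  # The squares in Z_5 are {0, 1, 4}; in particular 2 is not among them,
  # so t^2 - 2 has no root in Z_5.  A quadratic with no roots is irreducible,
  # so Z_5[t]/<t^2 - 2> is a field with p^2 = 25 elements, i.e. GF(25).
  assert {x * x % p for x in range(p)} == {0, 1, 4}
  assert 2 not in {x * x % p for x in range(p)}

)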
---------------------------------------------------------------------- You ask about Stewart's reference to the 11th root of 1 as a primitive 11th root of unity on p.233. What Stewart is doing on this page is criticizing the approach he has taken so far, which has accepted roots of unity as "radicals" because they satisfy equations t^n - a = 0, namely with a = 1. Ordinarily, one can think of a solution to t^n - a = 0 as "the nth root of a", but Stewart points out that if one denoted zeta_11 by the symbol "11th root of 1", the obvious interpretation of that symbol would be "1", which is not a primitive 11th root of unity. The reason for the difference is that t^11 - 1 is not irreducible, so that when we adjoin a zero of that equation to a field, we have to distinguish which irreducible factor of t^11 - 1 it is a zero of, t-1 or t^10+...+t+1. Only the latter gives a primitive 11th root of unity. In this informal motivational discussion Stewart is playing on the ambiguity of the symbol "nth root of a". For a a positive real number, that symbol has the precise meaning of the positive real solution of t^n - a = 0, but there is also the loose sense of "any solution of t^n - a = 0". He is implicitly saying that the loose sense of the symbol might justify our thinking of zeta_11 as represented by the symbol "11th root of 1", but that the more precise sense does not. ---------------------------------------------------------------------- I hope I showed where the "rabbit out of the hat" on p.236 came from. As for why Stewart "pulls these things out of nowhere and then gives no explanation", my guess is that he feels that this is what math texts do in general, and that he is just being more honest and pointing it out when he does it. Unlike him, I think that many of the techniques we use can be successfully motivated. Perhaps he feels that students can gain most by analyzing the techniques used after seeing them, rather than having the explanations handed to them on a platter. And perhaps there are some differences between the English and American educational systems that make this more valid there than here. I don't know. ---------------------------------------------------------------------- You ask about the "Z_4 quotient group" referred to on p.239, line 4. That should be "subgroup", not "quotient group" -- thanks for pointing it out! As I discussed in class, the Chinese Remainder Theorem shows that Z_20 is isomorphic to the direct product of Z_5 and Z_4 as rings, hence the group of its invertible elements is isomorphic to the direct product of the groups of invertible elements of those two rings, which have the forms Z_4 and Z_2. Since the subgroup Z_4 comes from the ring Z_5, and consists of those elements of the direct product with identity-element in their Z_2 component, it corresponds to the automorphisms that permute the primitive 5th roots of unity while fixing the primitive 4th roots of unity, +-i. So it corresponds to the Galois group of Q(i,zeta):Q(i), equivalently, Q(xi):Q(i). ---------------------------------------------------------------------- You ask whether the "k" in Definition 21.1, p.241, is a single value, or varies from one step to another. Good point. As you guessed, he means it varies from one step to another, though his wording doesn't make that clear. ---------------------------------------------------------------------- The three of you asked why, on p.242, first paragraph, we have Q(theta, zeta) = Q(theta zeta).
The multiplicative orders of theta and zeta, namely p-1 and p, are, as Stewart says on the first line, coprime. It follows that the order of their product is the product of their orders. (For elements of coprime orders in an arbitrary finite abelian group, I proved this in my lecture of April 22, in clarifying the proof of Lemma 20.6.) Hence the subgroup generated by both of them will be cyclic with that product as a generator. Hence Q(theta zeta) contains both of them, hence contains Q(theta, zeta). The reverse inclusion is immediate. ---------------------------------------------------------------------- You ask about the comment on p.243, 3rd and 4th lines below display (21.9), that the maximum radical degree is max(2,(p-1)/2). Well, in that remark p is assumed an odd prime, so p-1 is even, so (p-1)st roots can be expressed in terms of square roots and (p-1)/2-th roots. ---------------------------------------------------------------------- You ask where in the argument on p.246 Stewart uses the assumption m_{[\epsilon^{p}]}(t) \notequal m_{[\epsilon]}(t) made on line 7 of that page. In the middle of the page, where he says "Therefore, m_{[\epsilon^{p}]}(t) and m_{[\epsilon]}(t) have a common zero ... so that \bar{t^n - 1} has a repeated zero ...". If they were the same factor of \bar{t^n - 1}, having a zero in common wouldn't make that a repeated zero. The same argument could be done without contradiction. Without assumption as to whether these two divisors of \bar{t^n - 1} are distinct, one could show as he does that they have a common zero, hence since \bar{t^n - 1} has no multiple zeros, each zero is the zero of one factor only, hence these factors with a common zero must be the same. ---------------------------------------------------------------------- The question you handed in is from section 22.2, which we're not covering! However, here is the answer. You wanted to know how Stewart comes up with the elements phi and psi used on p.253. To build up the splitting field of f step by step, we want to construct first the fixed field of the normal subgroup A_3, and then from it the whole field. The fixed field of A_3 will consist of elements that are invariant under cyclic permutations of alpha_1, alpha_2, alpha_3, but not necessarily under other permutations. The easiest expressions to examine are polynomials in the alpha's. Note that any polynomial in a set of elements that is invariant under a group G of permutations will be a linear combination of the sums of "orbits" of monomials under G. Experiment, and you will find that any monomial of degree 1 or 2 will have the property that its orbit under A_3 is also invariant under S_3, so it will give an element of the base field rather than generating the extension we want. When we reach degree 3, we find that some of the monomials still have that property, but the monomial alpha_1^2 alpha_2 is sufficiently "asymmetric" so that taking its orbit under cyclic permutations doesn't completely symmetrize it. So we use its orbit, the sum of which Stewart calls phi. To simplify the argument that follows, he also throws in the sum psi of the orbit of alpha_1 alpha_2^2, the image of phi under any element of S_3 that is not in A_3. ---------------------------------------------------------------------- You ask whether for each n there is a formula for the discriminant of a polynomial f of degree n in terms of its coefficients, like the one Stewart gives for the cubic on p.256.
Yes, because the discriminant is a symmetric polynomial in the zeroes of f (we saw that it is unchanged under permutation of the zeroes), and every symmetric polynomial is expressible in terms of the elementary symmetric polynomials (Exercise 8.4), which are given by the coefficients of f. ---------------------------------------------------------------------- You ask what Stewart means, on the line after the 3rd display on p.257, by "expand in powers of t". He means multiply out the big product (n! terms) and write the result as a polynomial in t; i.e., collect those terms that don't involve t and make their sum the constant term of the polynomial, collect those terms with a single factor t and make their sum the linear term of the polynomial, and so on, all the way to the t^{n!} term. ---------------------------------------------------------------------- > I had trouble with the proof of Thm 22.8 on page 258, especially the > third sentence below the first display in the proof. I didn't see why > Q_1 is one of the factors, ... Well, if you'd looked closely at the reason Stewart gives, you would have seen that he says "because y - beta divides H", and you would have asked "What is y?", and have been the first to discover another typo. That should be "t - beta". So: t - beta is one of the factors of H over Sigma by definition of H. Now the irreducible factors Q_1, ..., Q_k of Q, when looked at over Sigma, are gotten by collecting together different factors t - sigma_x(beta) in the definition of Q, as discussed in the next-to-last paragraph of p.258. By assumption, Q_1 is the one and only one of the Q_j's that has t - beta itself as a factor. So, since we have observed that H also has t - beta as a factor, Q_1 must be one of the Q_j's that H is the product of. ---------------------------------------------------------------------- You ask about the generalization of Gauss's Lemma that I remark (in the corrections) that Stewart implicitly calls on on p.258. That generalization is gotten by replacing the base ring "Z" in our statement of Gauss's lemma by any unique factorization domain, i.e., ring in which every element has a factorization into irreducible elements that is unique up to multiplication by invertible elements. The proof is virtually the same. Then one proves (with the help of that lemma, and the fact that polynomials over a field form a unique factorization domain) that for any unique factorization domain A, the polynomial ring A[t] is also a unique factorization domain. Then one verifies by induction on n that polynomials in n indeterminates over a field form a unique factorization domain, and hence that Gauss's lemma is applicable to them. The arguments are all elementary; this is regularly covered in Math 250A, and sometimes, I think, in Math 113 as well. Even if it isn't taught, it's likely to be in a 113 textbook. ---------------------------------------------------------------------- You ask about the feasibility of the factorizations that Stewart's algorithm (pp.257-258) calls for. I've never studied these computational questions in practical terms. But cf. Exercise "3.19" on the second homework assignment sheet. You also ask for an example of its application more complicated than the quadratic he gives at the end. I shudder at the thought! ---------------------------------------------------------------------- You ask whether there is an "easiest way" to compute bold-G (p.258) for polynomials of relatively high degree. 
The first problem is how to factor Q into irreducible factors Q_1, ..., Q_k. Exercise "3.19" on the second homework assignment sheet shows that when the base field is the rational numbers, there is an algorithm for factoring polynomials in one variable. I would imagine that similar methods could be developed for polynomials in several variables; but what their level of computational difficulty would be, I don't know. Once one has Q_1, it should be relatively easy to see which elements of S_n it is invariant under, thus giving bold-G -- unless, of course, n is really large, say around 20, so that Q_1 has billions of terms, filling up a giant database, in which case even that problem could require great ingenuity from a programmer. Though in that case, it's likely that one would get stuck before that stage, in trying to factor Q, or even to write it down. ---------------------------------------------------------------------- You ask how renumbering the roots changes Gamma(f) to a conjugate subgroup, as stated on p.251. I talked about this in class on Monday in connection with that day's reading. To be precise, it is the subgroup of S_n corresponding to Gamma(f) that is changed to a conjugate. I gave the example of the polynomial t^4 - 2. If we let alpha_1 = 2^{1/4}, alpha_2 = i 2^{1/4}, alpha_3 = -2^{1/4}, alpha_4 = -i 2^{1/4}, then the permutation (1234) belongs to the Galois group (it is the rotation sigma of Fig.13.1, p.183). But if we number them so that alpha_1 = 2^{1/4}, alpha_2 = -2^{1/4}, alpha_3 = i 2^{1/4}, alpha_4 = -i 2^{1/4}, then that automorphism is represented by (1324), while (1234) ceases to represent an automorphism, since an automorphism sending 2^{1/4} to -2^{1/4} must clearly send -2^{1/4} back to 2^{1/4}. ---------------------------------------------------------------------- You ask why, in the proof of Theorem 22.7, p.256, we have to require char K not-= 2. The author shows explicitly where this is needed, when he says in the proof "since char K not-= 2 we have delta not-= -delta." ---------------------------------------------------------------------- You're right that in the proof of Theorem 22.7, p.256, the question of separability needs to be considered in getting statements 1 and 3 of that theorem. However, this is taken care of if we start with the (trivial) proof of statement 2. That proof doesn't use Galois theory, and its result shows that if f is inseparable, then Delta(f) = 0. Hence in proving statement 1 only the separable case needs to be looked at, and in statement 3 the inseparable case is excluded by the assumption "Delta(f) not-= 0" that I added in the errata. ---------------------------------------------------------------------- You ask what Stewart means by "conjugate subgroups" and "conjugacy classes of subgroups" on pp.251-252. For any g\in G, "conjugation by g" means the map h |-> g^{-1} h g (or, depending on the author, g h g^{-1}; since conjugation by g under one definition corresponds to conjugation by g^{-1} under the other, the two definitions lead to the same set of "conjugation maps", also called "inner automorphisms of G"). Subgroups H_1 and H_2 are then called conjugate if there is some g\in G such that H_2 = g^{-1} H_1 g. Being conjugate is an equivalence relation on subgroups, and the equivalence classes are called conjugacy classes of subgroups. (Being conjugate is likewise an equivalence relation on elements, and the equivalence classes are called conjugacy classes of elements.)
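Incidentally, if you like checking such things by machine, here is a minimal sketch in Python (entirely my own illustration, with all names my own; nothing like it appears in Stewart) verifying that the subgroups of S_3 generated by two different transpositions are conjugate. It uses the first of the two conventions above, h |-> g^{-1} h g, with permutations of {0,1,2} coded as tuples:

    # Permutations of {0,...,n-1} are tuples p with p[i] = the image of i.

    def compose(p, q):
        """Return p o q: first apply q, then p."""
        return tuple(p[q[i]] for i in range(len(q)))

    def inverse(p):
        """Return the inverse permutation of p."""
        r = [0] * len(p)
        for i, image in enumerate(p):
            r[image] = i
        return tuple(r)

    def conjugate(H, g):
        """Return the set g^{-1} H g."""
        g_inv = inverse(g)
        return {compose(g_inv, compose(h, g)) for h in H}

    e   = (0, 1, 2)   # identity of S_3
    s12 = (1, 0, 2)   # transposition of the first two points
    s13 = (2, 1, 0)   # transposition of the first and last points
    s23 = (0, 2, 1)   # transposition of the last two points

    print(conjugate({e, s12}, s23) == {e, s13})   # prints True

So the two-element subgroups {e, s12} and {e, s13} lie in the same conjugacy class of subgroups, conjugated into each other by the remaining transposition.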
---------------------------------------------------------------------- You ask what is "a good way to get the Galois group of an irreducible polynomial". Certainly not the method of section 22.4, as Stewart indicates in the very first paragraph of that section, p.256. So far as this course is concerned: study the algebraic properties of the zeros, and use whatever particular facts about these you can come up with. In connection with Exercise "21.15(b)", the hint suggests the particular facts to use. ---------------------------------------------------------------------- You ask, in connection with the last paragraph of section 22.1, p.252, what the 5 conjugacy classes of transitive subgroups of S_4 and of S_5 are. For S_4 you found them all: S_4, A_4, D_8, V, and the cyclic subgroup <(1234)>. (Three of these are normal; for the other two we have to say "the conjugacy class of ---".) For S_5, note that if G is such a subgroup, then for any x\in {1,2,3,4,5}, the orbit-size formula |G x| = [G:G_x], with |G x| = 5 by transitivity, shows that |G| must be divisible by 5; hence by Cauchy's Theorem, G must contain an element of order 5, which must be a 5-cycle. By a conjugation, we can assume without loss of generality that this is (12345). So we only need to consider subgroups containing that element. See how far you can go from there in finding different subgroups. If you fall short of 5, I can show you how to finish. Incidentally, the argument showing that every such subgroup can, up to conjugacy, be assumed to contain (12345) generalizes to any prime n, and is probably the reason for the fact mentioned by Stewart, that the number of conjugacy classes tends to be relatively small when n is prime. ---------------------------------------------------------------------- You ask why at the end of the proof of Theorem 22.7, p.256, we can say that G^dagger = K. By the Fundamental Theorem of Galois Theory! It's true that to apply this, we have to verify that f is separable. But that follows from statement 2, and the assumption I added to statement 3 in one of my errata, that Delta(f) is nonzero. ---------------------------------------------------------------------- In your answer to your pro forma question on the behavior of delta under elements of the Galois group (p.256), you say that a transposition changes the sign of delta because it changes the sign of exactly one factor. This is not quite true. A transposition (i, i+1) changes the sign of only (alpha_i - alpha_{i+1}); but a transposition (i, j) with j-i > 1, besides changing the sign of (alpha_i - alpha_j), also changes the signs of the factors (alpha_i - alpha_k) and (alpha_k - alpha_j) for all k between i and j. However, the number of factors of the former sort equals the number of the latter sort, so their effects on the sign cancel out; thus (alpha_i - alpha_j), the one factor whose sign is changed that is not involved in this cancellation, causes the sign of the whole product to change. ---------------------------------------------------------------------- You both ask about the need to study pi analytically, as on p.270. The only definition of pi that we have is analytic. It can be formulated in various ways; the definition used implicitly in this section is that pi/2 is the smallest positive real number x such that cos x = 0. The classical definition is as the ratio of the circumference to the diameter of a circle, but the easiest way to make this precise is to define the circumference as an integral, which is again an analytic definition.
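(For instance, for the unit circle, the arc-length formula applied to the quarter-circle y = sqrt(1 - x^2), 0 <= x <= 1, gives circumference = 4 integral_0^1 dx/sqrt(1 - x^2); evaluating that expression is again a matter of analysis.)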
The analytic definitions do not lead to any algebraic characterization (I don't count exponentiation by a non-rational number as an algebraic operation), so the only methods we have to study it are analytic. (If its properties did lead to an algebraic characterization, this would make it an algebraic rather than a transcendental number.) ---------------------------------------------------------------------- You ask why J_n is an integer in the computation on p.270, given that the expression for it contains polynomials in pi/2. The conclusion that it is an integer is obtained under the assumption that pi is rational (after multiplying through by the appropriate power of the denominator of pi). That assumption is then used to get a contradiction. ---------------------------------------------------------------------- You ask whether the results of this reading (p.270) might be proved using the theory of transcendental extensions. I don't think so. That theory concerns the relations among transcendental elements, but doesn't give us a way to come up with transcendental elements to start with. ----------------------------------------------------------------------
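P.S. Apropos of the earlier question about discriminant formulas for general n: if you want to see such a formula come out of a machine, here is a small sketch using the Python computer-algebra library sympy (my own illustration; "discriminant" is sympy's built-in function, and the remaining names are mine). For the reduced cubic t^3 + pt + q it reproduces the familiar formula Delta = -4 p^3 - 27 q^2, and it checks the coefficient computation against the definition of the discriminant as the product of the squared differences of the zeros:

    from sympy import symbols, discriminant, expand, simplify

    t, p, q = symbols('t p q')

    # The discriminant straight from the coefficients:
    print(discriminant(t**3 + p*t + q, t))        # prints -4*p**3 - 27*q**2

    # The same quantity from the zeros of a generic monic cubic:
    a, b, c = symbols('a b c')
    f = expand((t - a)*(t - b)*(t - c))           # monic cubic with zeros a, b, c
    delta_squared = ((a - b)*(b - c)*(a - c))**2
    print(simplify(discriminant(f, t) - delta_squared))   # prints 0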