You ask about the mathematical use of words like "field" and "ring"
(p.1).
I think that "field", "ring", "group" and "domain" are all cases where
mathematicians chose a general word meaning "a collection of things
that one can choose from", and gave it a technical meaning. I don't
think that any particular differences between the meanings of those
words motivated these choices; whoever came first got first pick.
So far as I know, all European languages use for "ring", "group" and
"(integral) domain" words having the same literal meanings as the
English ones; but for "field" they are split: English and Spanish
use the word meaning a (farmer's) field, while French and German use
the word for "body" (French "corps", German "Koerper"). For the
analogous structures but where multiplication may be noncommutative,
these languages use modified terms: "skew field", "corps gauche",
"Schiefkoerper". But Russian avoids this awkwardness: Someone must
have noticed the situation in Western languages, and cleverly assigned
their word for "field" (polya) to the commutative concept, and the word
for "body" (telo) to the one without an assumption of commutativity.
I have sometimes conjectured that the choice of the word "ring" was
a pun: A subset of C is a ring if and only if it is "closed" under
the appropriate operations. (On the other hand, Z/n can be thought
of as ring-like in a different way.)
----------------------------------------------------------------------
You ask whether writing "i = sqrt -1" as on p.2 isn't an abuse
of notation.
That's a good point, though I wouldn't exactly say it is an abuse
of notation: One can only abuse notation after one has decided what
the notation will be, and what you've pointed out is that we haven't
decided what "sqrt z" should mean when z is a complex number. There
are different possible choices. One can, for instance, show that every
complex number has a unique square root in the union of the open upper
half-plane with the set of nonnegative real numbers. If we decide to
call that "sqrt z", then "i = sqrt -1" is correct. In Math 185 one has
to make choices like this (defining a "principal value of the square
root function"). Here we are instead loosely using "sqrt z" to mean
"one of the square roots of z", with the understanding that once this
is chosen, the other square root of z will be written "- sqrt z".
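If it helps to see such a choice made concrete, here is a small Python
sketch (my own illustration, not something from Stewart). The name
sqrt_upper is hypothetical. Note that the library function cmath.sqrt
makes a different choice (the root with nonnegative real part), so the
sketch flips the sign when that root lies in the lower half-plane.

```python
import cmath

def sqrt_upper(z):
    """Return the square root of z that lies in the open upper
    half-plane or on the nonnegative real axis."""
    w = cmath.sqrt(z)      # library's choice: real part >= 0
    if w.imag < 0:         # lower half-plane, so take the other root
        w = -w
    return w

print(sqrt_upper(-1))      # the root we are calling i
print(sqrt_upper(-2j))     # cmath.sqrt alone would give the other root
```

Either convention is legitimate; the point is only that some choice has
to be made before "sqrt z" names a single number.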
----------------------------------------------------------------------
In connection with the discussion on pp.5-6, you ask whether there
are other important number systems than those Stewart mentions.
The answer depends on how widely one casts one's net. One can think
of all rings as "number systems", so that rings Z/n, polynomial rings,
rings of functions, rings of matrices, etc. are all examples.
On the other hand, if one restricts one's attention to structures that
"extend" the objects Stewart names in a natural way, there are far
fewer, and I'm afraid they are not, as you say, "indispensable in
mathematics". One is the ring of quaternions. As the complex numbers
are 2-dimensional over the real numbers, so the quaternions are
2-dimensional over the complex numbers; but they are not commutative.
They have a basis over R of 4 elements, 1, i, j, k, and a
peculiar multiplication where 1 acts "as one would expect",
each of i, j, k has square -1, while the product of two of them
that are not the same is just like their cross-product in physics.
The cross-product is not associative, but the multiplication of the
quaternions is, because of the products i^2 = j^2 = k^2 = -1 that
don't work like the cross-product. Quaternions are of some use in
geometry, but have not shown themselves nearly as important as the
real and complex numbers.
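Here is a minimal Python sketch of that multiplication, assuming we
store a + bi + cj + dk as the tuple (a, b, c, d). The function names
qmul and cross are my own. It checks that ij = k while ji = -k, and
compares associativity with the cross product on one example.

```python
def qmul(x, y):
    """Product of quaternions stored as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    )

def cross(v, w):
    """Ordinary cross product of vectors in R^3."""
    x1, y1, z1 = v
    x2, y2, z2 = w
    return (y1*z2 - z1*y2, z1*x2 - x1*z2, x1*y2 - y1*x2)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j), qmul(j, i))                    # k and -k
print(qmul(qmul(i, i), j), qmul(i, qmul(i, j)))  # equal: both are -j

ex, ey = (1, 0, 0), (0, 1, 0)
print(cross(cross(ex, ex), ey), cross(ex, cross(ex, ey)))  # not equal
```

The last line shows the failure of associativity for the cross product:
(i x i) x j is 0, but i x (i x j) is -j.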
There is also an 8-dimensional structure, containing the quaternions,
called the octonions, which is not even associative, but still has the
property that the product of any two nonzero elements is nonzero.
These are less used even than the quaternions. And there things stop:
Outside of dimensions 1, 2, 4, 8, it has been proved that one can't
define any multiplication (bilinear map) on a finite-dimensional
vector-space over R such that a product of nonzero vectors is nonzero.
If one doesn't restrict oneself to extensions of R, or to
finite-dimensional extensions, then there is an endless bounty of
fascinating examples, but in general, each of these is just "one
among many".
I guess there is one class of rings that are of an importance
and "uniqueness" that puts them in a class with the reals and the
complexes, at least for number-theorists: For each prime number p
there is what is called "the ring of p-adic integers". But the
construction is not one of which I can give a thumbnail sketch; if you
ask in office hours, I can give a 10- or 15-minute explanation of
the idea.
----------------------------------------------------------------------
You ask how one would think of the approach to solving the cubic
that Stewart sketches on p.8.
I don't know how people arrived at it historically, but here is how
I would motivate it. Consider the solutions to the linear and the
quadratic equations. The former is just a constant that one calculates
by arithmetic, while the two solutions to the latter (as I noted the
first day of class) are one constant plus-and-minus another:
linear: r = rho
quadratic: r_1 = rho_0 + rho_1
r_2 = rho_0 - rho_1
(where the quadratic formula tells us in detail what rho_0
and rho_1 are).
Now +1 and -1 are the two square roots of 1. The three cube roots
of 1 are 1, omega and omega^2 (where omega is a primitive
cube root of 1), and fiddling around, one can conjecture the following
analogous formula:
cubic: r_1 = rho_0 + rho_1 + rho_2
r_2 = rho_0 + omega rho_1 + omega^2 rho_2
r_3 = rho_0 + omega^2 rho_1 + omega rho_2
I started to show on the first day of class (though I ran out of time)
that given a cubic t^3 + c_2 t^2 + c_1 t + c_0, if one assumes
it factors as (t - r_1) (t - r_2) (t - r_3) and substitutes in
the above values, one gets equations saying that 3 rho_0 = -c_2, while
(rho_1)^3 and (rho_2)^3 satisfy a certain quadratic equation.
Solving that equation and taking cube roots, one gets values to use
for rho_1 and rho_2, and hence a solution to the cubic.
Having gone through these messy calculations, one can use hindsight to
simplify the calculation a bit. First, one sees that rho_0 is just
an added constant that doesn't affect the way the roots relate
to each other, so by changing variables to subtract this constant from
t, one can make rho_0 = 0, greatly simplifying the multiplications
one has to do in expanding (t - r_1) (t - r_2) (t - r_3). Since
1 + omega + omega^2 = 0, once rho_0 is eliminated the roots
r_1, r_2 and r_3 add up to 0; so getting rid of rho_0 is
equivalent to making the term c_2 zero. Also, we found that the
computation gave us formulas for the _cubes_ of rho_1 and rho_2,
i.e., rho_1 and rho_2 are gotten by computing certain cube roots;
so we may as well anticipate things by calling them cuberoot(u)
and cuberoot(v) when we introduce them.
Making these changes, the method I sketched turns into the method
Stewart sketched.
Hope this helps.
----------------------------------------------------------------------
You ask why one chooses \alpha and \beta near the bottom of p.9
so that 3\alpha\beta + p = 0.
\alpha and \beta are supposed to represent the cube root of u
and the cube root of v in equation (1.4); so 3\alpha\beta + p = 0
is needed for that equation to hold.
----------------------------------------------------------------------
You ask how Stewart gets the 9 expressions on the bottom of p.9.
The preceding computation has shown that any solution of the given
cubic will be the sum of a cube root of u and a cube root of v.
If we let alpha be one cube root of u, then the others will be
omega alpha and omega^2 alpha; similarly if beta is one cube
root of v, the others are omega beta and omega^2 beta. So the
sums "a cube root of u plus a cube root of v" are exactly the
9 combinations Stewart shows.
----------------------------------------------------------------------
You ask about the choice of the three "good" expressions among the
nine expressions at the bottom of p.9.
Each is a sum of two terms, the first of which is supposed to be the
"cube root of u" in equation (1.4) on the preceding page, and the
second of which is supposed to be the "cube root of v" there. Suppose
alpha and beta are one pair of choices for the cube roots which
satisfy (1.4). (We can choose these by letting alpha be any cube
root of u, and letting beta = -p / (3 alpha).) Then if we take a
different cube root of u for use in constructing a different root,
say omega alpha, then the cube root of v that goes with it, to still
satisfy (1.4), must be beta / omega. Since omega^3 = 1, we have
1/omega = omega^2, so we can write this cube root of v as
omega^2 beta. Similarly, for our third root we use omega^2 alpha +
omega beta.
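To see the "three good out of nine" numerically, here is a rough Python
sketch. It assumes the cubic has the form t^3 + pt + q, that u and v are
the two roots of x^2 + qx - p^3/27 = 0, and that (1.4) is the condition
3 alpha beta + p = 0; the example cubic t^3 + 6t - 20 is my own choice.

```python
import cmath

p, q = 6.0, -20.0                      # example cubic t^3 + 6t - 20
omega = cmath.exp(2j * cmath.pi / 3)   # a primitive cube root of 1

# Assumed setup: u, v are the roots of x^2 + q x - p^3/27 = 0.
disc = cmath.sqrt(q * q / 4 + p ** 3 / 27)
u, v = -q / 2 + disc, -q / 2 - disc
alpha = u ** (1 / 3)                   # some cube root of u
beta = v ** (1 / 3)                    # some cube root of v, not yet matched

def cubic(t):
    return t ** 3 + p * t + q

good, bad = [], []
for j in range(3):
    for k in range(3):
        a, b = omega ** j * alpha, omega ** k * beta
        # Keep the pair only if it satisfies 3ab + p = 0.
        (good if abs(3 * a * b + p) < 1e-9 else bad).append(a + b)

print(len(good), len(bad))
print([abs(cubic(t)) for t in good])   # all essentially zero
```

The six sums in the list bad fail the pairing condition, and
substituting them into the cubic does not give zero.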
----------------------------------------------------------------------
You ask regarding the discussion at the bottom of p.9, "... can one
find a formula for the roots of a cubic which only produces correct
roots?"
Yes. Take Cardano's formula, but in place of the second summand, put
-p /(3 times the first summand). This will ensure that (1.4) is
satisfied, which is what Stewart is concerned with at the bottom of p.9.
(The fact that this element will be a cube root of v follows from
(1.6), which u and v were chosen to satisfy.) This is really what
Stewart does at the bottom of p.9. But the resulting expression, if
written in place of Cardano's formula, would lose the beautiful symmetry
of that formula.
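As a sketch of that modified formula in Python (again assuming the
cubic is t^3 + pt + q and that u is a root of x^2 + qx - p^3/27 = 0; the
function name cardano_roots is my own):

```python
import cmath

def cardano_roots(p, q):
    """Roots of t^3 + p*t + q, taking the second summand to be
    -p / (3 * first summand) so that only correct roots appear."""
    omega = cmath.exp(2j * cmath.pi / 3)
    disc = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    u = -q / 2 + disc
    if abs(u) < 1e-14:           # avoid dividing by zero below
        u = -q / 2 - disc
    if abs(u) < 1e-14:           # p = q = 0, so the cubic is t^3
        return [0j, 0j, 0j]
    alpha = u ** (1 / 3)         # one cube root of u
    roots = []
    for k in range(3):
        first = omega ** k * alpha
        roots.append(first - p / (3 * first))
    return roots

# Bombelli's cubic t^3 = 15t + 4, i.e. p = -15, q = -4.
for t in cardano_roots(-15, -4):
    print(t)
```

In this example u is not real, yet the three roots printed are real up
to rounding error; the complex cube roots are used only along the way.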
----------------------------------------------------------------------
You ask about Stewart's statement on p.10, last full paragraph, that
Cardano's formula is pretty much useless when there are three real
roots.
I don't really know why he says that. Maybe the idea of using
complex numbers to find real numbers goes against the grain; but
it shouldn't to a mathematician ... .
----------------------------------------------------------------------
You ask about "permuting the roots" of a quintic in the discussion
on pp.12-13.
Stewart doesn't mean this discussion to be something he expects you
to follow in mathematical detail; he's just throwing around the ideas
to give you a sense of the history. (He will be more precise when we
actually start developing Galois theory! But it would be nice if
he made this basic idea of permuting the roots clearer.) What he
means is that we take some expression in the roots, such as the last
display on p.12 or any of the first four on p.13 (for the cubic, the
quartic, and the quintic respectively) and replace it by a new
expression in which each occurrence of a given root alpha_i is
replaced by some fixed alpha_{i'}. For instance, in the last display
on p.12, one permutation would replace alpha_1 by alpha_2, alpha_2
by alpha_3, and alpha_3 by alpha_1. One discovers that the effect
is to multiply the sum (alpha_1 + omega alpha_2 + omega^2 alpha_3) by
omega^2, hence the expression shown, the cube of that sum, is left
unchanged. On the other hand, if we interchange alpha_2 and alpha_3
but leave alpha_1 unmoved, the expression on the bottom of p.12
turns into the one on the top of p.13; and we find that every
permutation has one of these two effects, i.e., the original expression
is always turned into one of a list of two expressions by a permutation
of the roots.
The expression Stewart gives on p.13 for the quartic is, as I indicated
in class, incorrect. Permutations of the roots turn that expression
into six different expressions; but they turn the corrected expression
that I wrote on the board, (alpha_1 + alpha_2 - alpha_3 - alpha_4)^2
into only three different expressions, which allows one to reduce the
quartic to a cubic (but how they allow this is something we won't go
into).
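If you want to experiment with this, here is a rough Python sketch that
substitutes sample numbers for the roots, applies every permutation,
and counts the distinct values that result. The sample roots are
arbitrary choices of mine, and counting numerical values is only a
stand-in for counting expressions (it could undercount for unlucky
choices of roots).

```python
import cmath
from itertools import permutations

omega = cmath.exp(2j * cmath.pi / 3)   # a primitive cube root of 1

def distinct_values(expr, roots, digits=6):
    """Set of values expr takes as the roots are permuted."""
    seen = set()
    for perm in permutations(roots):
        z = complex(expr(*perm))
        seen.add((round(z.real, digits), round(z.imag, digits)))
    return seen

def cubic_expr(a1, a2, a3):
    return (a1 + omega * a2 + omega ** 2 * a3) ** 3

def quartic_expr(a1, a2, a3, a4):
    return (a1 + a2 - a3 - a4) ** 2

cubic_values = distinct_values(cubic_expr, [2, 3, 7])
quartic_values = distinct_values(quartic_expr, [1, 2, 4, 8])
print(len(cubic_values), len(quartic_values))
```

With these sample roots the six permutations of the cubic expression
give two values, and the 24 permutations of the quartic expression give
three.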
----------------------------------------------------------------------
You ask about the explicit solution of a particular quintic referred
to on the bottom of p.13.
I don't know the details -- Stewart doesn't claim you should be able
to see it; on the contrary, he gives a reference, implicitly encouraging
any reader who is interested to look up the paper named. (When a piece
of mathematical writing contains a phrase like "see Berndt, Spearman
and Williams (2002)", this means that you will find in the bibliography
an item identifiable by that author-list and that date. The
bibliography of this book is on pp.279-282, and this reference is on
the first page of that bibliography.)
----------------------------------------------------------------------
You ask why Exercise 1.10, p.15 says that to try to generalize
Bombelli's observation (paragraph below bottom on p.10) is
"usually pointless".
I really don't know what Stewart had in mind. The best I've been able
to come up with is that when that method is applicable, Cardano's
formula gives the answer 2 alpha. So assuming alpha was to be
taken rational, an analog of Bombelli's method could only work when the
cubic had a rational root, which it usually doesn't. However, that
doesn't explain why Stewart begins "When 27 q^2 + 4p^3 < 0".
----------------------------------------------------------------------
You ask, regarding the proof of the Fundamental Theorem of Algebra
on p.25, "How did Gauss choose \gamma(\theta) = p( r(\epsilon) *
e^(i\theta))/ (r(\epsilon)^n + 1)?"
Gauss didn't! Stewart says in the middle of p.22 that "The ideas
behind the proof we give here (but not their precise expression) go
back to Gauss". The idea is one that I described in class in
previewing that reading on Friday: Look at p as giving a function
C --> C; note that if you traverse a circle around the origin
in the domain plane, its image will be some sort of loop in C.
Moreover, if the circle has very large radius, so that the values you
substitute for t have large absolute value, then the t^n term
will be much larger than the lower-exponent terms in p, so the loop
you get will look very much like t^n. But from the properties of
multiplication in C one can see that as t goes once around the
origin, t^n goes around n times; so the same will be true for
p(t) when the circle is large. On the other hand, when the circle
is very small, p(t) will move very little, and in general won't go
around the origin at all. Now if p had no roots, then as we gradually
expanded the circle we considered, none of the intermediate circles
would cut through the origin, so the number of times those circles
went around the origin (their winding numbers) would stay constant as
we increased the radius. This would give n = 0, a contradiction.
Stewart's contribution is to take these larger-and-larger loops,
which can be described as p(r e^{i theta}) for various radii r,
and "scale them down" so that they don't go to infinity, by dividing by
r^n + 1, and also to make the process where the domain circle goes to
infinity take place in "finite time" by letting r = epsilon / (1 -
epsilon). That gives the formula you ask about.
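The winding-number idea can be tried out numerically. The following
Python sketch is only an illustration: the polynomial t^3 - 2t + 5 and
the function name winding_number are my own choices, and the method
(adding up small changes in angle around the loop) is a crude numerical
stand-in for the precise definition.

```python
import cmath
import math

def p(t):
    return t ** 3 - 2 * t + 5          # sample polynomial, n = 3

def winding_number(f, r, samples=4000):
    """Number of times f(r e^{i theta}) goes around the origin as
    theta runs from 0 to 2 pi (assumes the loop misses the origin)."""
    total = 0.0
    prev = f(complex(r, 0))
    for k in range(1, samples + 1):
        cur = f(r * cmath.exp(2j * math.pi * k / samples))
        total += cmath.phase(cur / prev)   # small change in angle
        prev = cur
    return round(total / (2 * math.pi))

print(winding_number(p, 0.01))   # tiny circle: loop stays near p(0) = 5
print(winding_number(p, 100))    # huge circle: loop looks like t^3
```

The two answers differ, so somewhere between radius 0.01 and radius 100
the image loop must pass through the origin, i.e., p has a root.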
----------------------------------------------------------------------
You write that you don't see why the display on p.25 saying
two limits are equal is true.
The author substitutes for gamma_epsilon(theta) the expression by
which it was defined earlier on the page. Moreover, one sees that
in both places where epsilon appears in that expression, it is
inside the expression r(epsilon); so to understand that limit as
epsilon --> 1, one just has to think of what r(epsilon) does as
epsilon --> 1. What it does is approach infinity; so Stewart writes
the limit as one taken as "r(epsilon) --> infinity". One could quibble about this
notation, since r(epsilon) is not an independent variable; but I
hope that with this explanation, you see what he means.
----------------------------------------------------------------------
You ask about Stewart's reference to uniform convergence in the
sentence after the last display on p.25.
Well, that forced me to think it through, and the reason he gives
is wrong -- the fact that theta ranges over a closed (hence compact)
interval [0,2pi] does not force the convergence to be uniform (a
concept from Math 104, which is not a prerequisite for this course).
Rather, one needs to simply work through the preceding calculation more
carefully to verify that it shows continuity of gamma_epsilon(theta)
as a function of two variables, and hence gives the equality of winding
numbers.
You asked whether it isn't sufficient just to show continuity in
epsilon and theta separately. It is not. For instance, the
function f(x,y) defined to equal xy/(x^2 + y^2) when (x,y) is
not (0,0), and to equal 0 at that point, is continuous in x,
and continuous in y, but is not continuous as a function of two
variables, as can be seen by looking at its behavior on the line
y=x, where it is 1/2 everywhere except at the origin, but 0 there.
(That's an example one usually sees in Math 53.)
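A quick numerical look at that function, as a Python sketch:

```python
def f(x, y):
    """xy/(x^2 + y^2) away from the origin, and 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return (x * y) / (x * x + y * y)

for s in [1e-1, 1e-3, 1e-6, 1e-9]:
    # Along either axis the values are 0; along y = x they are 1/2.
    print(s, f(s, 0.0), f(0.0, s), f(s, s))
```

However small s is taken, f(s, s) stays at 1/2, so f(x,y) does not
approach f(0,0) = 0 as (x,y) approaches the origin along that line.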
However, don't worry -- because this step is essentially a
Math 104/Math 185 argument, you are not responsible for it in this
course. As we get close to the first midterm, I'll have to decide
what, if anything, you are responsible for from this section; but it
won't be much beyond the fact that the Fundamental Theorem of Algebra
holds.
----------------------------------------------------------------------
You ask whether, in the proof of Lemma 3.5, p.34, there shouldn't be a
largest k such that kd divides f and g.
If our base ring were the integers, this would be true, but over
a field K, every nonzero element has an inverse; so if a polynomial
a divides another polynomial b, say with b = ac, then for every
nonzero field-element k the polynomial ka will also divide b,
since b = (ka) (k^{-1}c).
----------------------------------------------------------------------
You ask about Stewart's choice of definition of "irreducible"
on p.36, under which 6t+3 is considered "irreducible".
I am unhappy with his choice of definition, but I can see the
point of it: He wants "reducibility" to mean that the polynomial
can be broken into factors each of which can contribute a nonempty
set of roots -- if not over the given field, then possibly over some
larger field. If you factor 6t+3 as 6 . (t + 1/2), the factor
"6" can't contribute any roots, so he doesn't want this to count as
a "reduction".
----------------------------------------------------------------------
In connection with the first sentence on p.39, you ask about the
feasibility of using computers to factor polynomials.
It's a subject I know nothing about. On the latest (=2nd) homework
sheet, in the two unassigned "Not from Stewart" problems, I show how to
estimate the size of the coefficients of possible factors. You could
work out the size of the bounds obtained, and hence the number of
possible factors that would have to be checked. But as to how these
can be improved on by more careful arguments, what other methods can
serve to cut down the set of factors that need to be checked, and
whether people have written software using these ideas, I don't know.
----------------------------------------------------------------------
You ask about the basis for the statement on p.40, next-to-last
line of section 3.3, "Clearly, a_1 ... a_s = 1".
The polynomial f has two factorizations, one as g_1 ... g_n
and another which Stewart has proved can be written
(a_1 g_1) ... (a_n g_n) after showing that the number of terms,
originally called s, is equal to n. This shows that g_1 ... g_n =
(a_1 g_1) ... (a_n g_n); hence cancelling the g's, 1 = a_1 ... a_n.
Since, as mentioned, n = s, this can be written a_1 ... a_s = 1.
----------------------------------------------------------------------
You ask how one can tell from Figure 3.1, p.44 that B and C are
multiple zeroes of the polynomial.
From the fact that they are zeroes where the derivative is also zero.
Note that if f(t) has a zero at t=a, we may write it as (t-a)g(t).
Now if we differentiate this and set t = a, the result comes to
g(a). (Check this!) Hence if f(t) just has the single zero at t=a
coming from the factor t-a, i.e., if g(t) does not also have a zero
there, the derivative f'(a) = g(a) is nonzero. This happens at point A,
but at points B and C the curve is horizontal, so the derivative
is 0, so when we factor f(t) in this way, g(a) must be zero, so
f(t) has a multiple zero at a. It is also easy to show that if a
real-valued polynomial has a zero of odd multiplicity at t=a, then
its graph crosses the axis there, while if it has a zero of even
multiplicity, its graph bounces off the axis (it has the same sign on
both sides of t=a). So we see that f(t) has a zero of even multiplicity
at B, and a zero of odd multiplicity >1 at C.
The graphs of y = t, y = t^2 and y = t^3 are familiar examples
of these kinds of behavior.
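Here is a small Python sketch of the derivative test, using a sample
polynomial of my own, f(t) = t (t-1)^2 (t-2)^3, in place of the one in
Figure 3.1 (whose formula I don't have). Polynomials are stored as
lists of integer coefficients, constant term first, and the function
names are hypothetical.

```python
def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def derivative(p):
    return [k * c for k, c in enumerate(p)][1:]

def evaluate(p, a):
    result = 0
    for c in reversed(p):          # Horner's rule
        result = result * a + c
    return result

def multiplicity(p, a):
    """Count how many of f, f', f'', ... vanish at a before one
    does not (valid over Q or R)."""
    m = 0
    while p and evaluate(p, a) == 0:
        p = derivative(p)
        m += 1
    return m

f = [1]
for factor in [[0, 1]] + [[-1, 1]] * 2 + [[-2, 1]] * 3:
    f = poly_mul(f, factor)        # f(t) = t (t-1)^2 (t-2)^3

print([multiplicity(f, a) for a in (0, 1, 2)])
print(evaluate(f, 0.9) < 0, evaluate(f, 1.1) < 0)   # same sign near t=1
print(evaluate(f, 1.9) < 0, evaluate(f, 2.1) < 0)   # sign change at t=2
```

The zeros at 0, 1 and 2 play the roles of A, B and C respectively.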
----------------------------------------------------------------------
You ask about Stewart's definition on p.50 of a field extension as a
certain kind of function (a monomorphism), rather than referring to
the fields themselves.
Good question! Conceptually, I would say a field extension consists
of two fields K and L together with a connecting monomorphism
iota: K --> L. Moreover, in the situations we will be considering, K
will most often be given (it will be the field containing the polynomial
whose roots we want to adjoin) so the focus will be on L, the field
we get by adjoining those roots. So we will often think of L as an
extension of the field K.
But since a function is considered to determine its domain and codomain,
the map iota determines K and L, so many authors, including
Stewart, choose for the sake of economy to define the extension as
just being that map. The choice is logically satisfactory (everything
one wants to say about K, L and iota can be stated in terms of
iota, since K and L can be described as its domain and codomain),
but I dislike it pedagogically, since the definition doesn't match
the way we think of the thing.
----------------------------------------------------------------------
You write
> On p.50, line after Example 4.2, it says that we can
> usually identify K with its image iota(K). When would
> this not be legitimate?
Remember when I talked in class about two "stories", one about
Mr. Smith meeting Miss Jones in New York and having a hamburger with
her, and the other about Mr. Nagata meeting Miss Kobayashi in Tokyo and
having sushi with her; and the fact that, if the plot of the two stories
was step-by-step identical, one could say that they were "the same
story, just using different names"; in particular, that Mr. Smith was
"the same as" Mr. Nagata, etc.? But I also said that if someone wrote
a novel in which the same Mr. Smith, Miss Jones, Mr. Nagata and Miss
Kobayashi were _all_ characters, then we could no longer say that Mr.
Smith was the same as Mr. Nagata, because if we did there'd be no way
of describing the interaction between them. So in the same way, we can
identify one mathematical object with another that is isomorphic to it
(in particular, K and iota(K)) as long as we are not dealing with a
structure to which they both belong as different parts. E.g., if K
and iota(K) were different subfields of C and we wanted to look at
equations relating elements of both these fields, we could not identify
them.
I hope this helps.
----------------------------------------------------------------------
You ask what one can adjoin (as in Definition 4.7, p.51) to Q to
get R.
Well, the shortest answer is "all elements of R" -- that will
certainly do it. Of course, there are smaller sets one can use;
e.g., the positive real numbers, or all numbers in the interval
[0,1], or all numbers in the interval [0, 0.00001], ... . It's
not hard to show that using any of these and elements of Q, one
can get all elements of R.
A better understanding of this sort of relation between Q and R
would go way beyond the reach of this course. One thing one can
certainly say is that any set X that, when adjoined to Q, gave
Q(X) = R, would have to be uncountable.
----------------------------------------------------------------------
You ask about the expression p(x_1,...,x_n)/q(y_1,...y_n) at the
bottom of p.53. That is poor notation on Stewart's part. I suppose
his thinking was that since X can be an infinite set, but each
polynomial can only involve finitely many variables, each of these
polynomials should be written using a finite list of elements of X;
and since the variables involved in the numerator may be different
from those in the denominator, he should use different symbols x_j
and y_j. But if he does so, there is no justification for taking
the number of variables in the numerator and denominator the same.
Of course, one can force them to be the same, by writing in some
variables that aren't actually involved. But in that case, one may
as well take the sets of variables in the numerator and denominator
to be the same, using the union of those that appear in one and in
the other. So the sensible alternatives to what he has are to write
p(x_1,...,x_m)/q(y_1,...y_n) with possibly different numbers of
variables, or p(x_1,...,x_n)/q(x_1,...x_n) with the same set of
variables.
----------------------------------------------------------------------
In your pro forma question you ask why, in the discussion on p.54,
i \in L' and \sqrt 5 \in L' imply L is contained in L'; and you
base your answer on the statement that every element of L has the
form a + bi + c \sqrt 5. But that is not true.
Every element of L = Q(i, -i, sqrt 5, -sqrt 5) has the form a + bi +
c \sqrt 5 + d i \sqrt 5. You could give an argument based on that
fact, but it wouldn't be a robust argument, because if you wanted to
reason similarly about field extensions generated by different
families of elements, you would have to figure out, in each case, the
form that elements of that field have.
A better reason is the following: L is defined to be the field
generated over Q by i and sqrt 5, i.e., the intersection of
all subfields of C containing those elements. So, since L' is
one of the fields containing those elements, L is contained in it.
Likewise, in proving L' = Q(i + sqrt 5) is contained in L, you
claim that every element of L' has the form a + b(i + sqrt 5), which
also isn't true. One can show that every element of L' has the form
a + b(i + sqrt 5) + c(i + sqrt 5)^2 + d(i + sqrt 5)^3 -- we will be
able to see that easily at the end of next week -- but we can get
the inclusion of L' in L without knowing this, simply from the
definition of L' = Q(i + sqrt 5) as the intersection of all subfields
of C that contain i + sqrt 5. Since L is a subfield containing
i + sqrt 5, L' must be contained in L.
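For the other inclusion, L contained in L', one needs i and sqrt 5 to
lie in Q(i + sqrt 5). Here is a Python sketch that checks one way of
writing them in terms of theta = i + sqrt 5. It stores a + bi +
c sqrt 5 + d i sqrt 5 as the tuple (a, b, c, d); the particular
formulas being checked, i = (theta^3 - 2 theta)/12 and sqrt 5 =
(14 theta - theta^3)/12, are my own computation, not Stewart's.

```python
from fractions import Fraction

def mul(x, y):
    """Product in Q(i, sqrt 5), basis 1, i, sqrt 5, i sqrt 5."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1*a2 - b1*b2 + 5*c1*c2 - 5*d1*d2,
        a1*b2 + b1*a2 + 5*(c1*d2 + d1*c2),
        a1*c2 + c1*a2 - (b1*d2 + d1*b2),
        a1*d2 + d1*a2 + b1*c2 + c1*b2,
    )

def combine(s, x, t, y):
    """The linear combination s*x + t*y with rational s, t."""
    return tuple(s * xi + t * yi for xi, yi in zip(x, y))

theta = (Fraction(0), Fraction(1), Fraction(1), Fraction(0))
theta2 = mul(theta, theta)
theta3 = mul(theta2, theta)

i_from_theta = combine(Fraction(1, 12), theta3, Fraction(-2, 12), theta)
sqrt5_from_theta = combine(Fraction(-1, 12), theta3, Fraction(14, 12), theta)
print(i_from_theta, sqrt5_from_theta)
```

Both results are polynomials in theta with rational coefficients, so
they lie in any subfield of C containing theta.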
----------------------------------------------------------------------
You ask what Stewart means on p.58, line 6, by "Separating out terms
of odd and even degree ...".
He has a polynomial p(t)\in Q[t]. Each term of this polynomial has
the form a_j t^j where j is a natural number. Each natural number
is either even or odd. Adding up the terms of even degree, one gets
a polynomial involving only even powers of t, t^{2h} = (t^2)^h; so
it can be regarded as a polynomial in t^2, a(t^2). Adding up the
terms of odd degree, one gets a polynomial involving only terms
t^{2h+1} = t . (t^2)^h; so it can be regarded as t times a
polynomial in t^2: t . b(t^2). Hence p(t) = a(t^2) + t b(t^2).
Now Stewart has assumed that p(sqrt pi) = 0. Substituting into the
above formula for p(t), and recalling that the square of sqrt pi
is pi, one gets a(pi) + (sqrt pi) b(pi) = 0, as he states.
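In terms of coefficient lists the separation is just a matter of taking
every other entry. A minimal Python sketch, with a sample polynomial of
my own and hypothetical function names:

```python
def split_even_odd(coeffs):
    """coeffs[j] is the coefficient of t^j.  Returns (a, b) with
    p(t) = a(t^2) + t * b(t^2)."""
    return coeffs[0::2], coeffs[1::2]

def evaluate(coeffs, x):
    result = 0
    for c in reversed(coeffs):     # Horner's rule
        result = result * x + c
    return result

p = [7, -1, 0, 4, 2, 3]            # 7 - t + 4t^3 + 2t^4 + 3t^5
a, b = split_even_odd(p)           # a = [7, 0, 2], b = [-1, 4, 3]

for t in range(-3, 4):
    assert evaluate(p, t) == evaluate(a, t * t) + t * evaluate(b, t * t)
```

Here a(s) = 7 + 2s^2 and b(s) = -1 + 4s + 3s^2, and the loop confirms
p(t) = a(t^2) + t b(t^2) at a few integer values of t.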
----------------------------------------------------------------------
You ask about the meaning of the "field K(t) of rational expressions"
that Stewart uses in Theorem 5.3, p.58.
He defines them more precisely on p.175, last paragraph: K(t) is
the field of fractions of K[t].
----------------------------------------------------------------------
You ask about Stewart's excluding m(t) = 0 in the proof of Theorem
5.10, p.61. Well, as I have said, the wording of Stewart's definition
of irreducible on p.36 is definitely a mistake; as it is stated it
would allow 0 as an irreducible element; as I have modified it, it
does not. One wants to make one's definitions correspond to the
properties of natural interest, and the role of 0 in K[t] is so
different from that of what we are calling irreducible polynomials that
it is proper that the definition be chosen so as not to call them by
the same name. If one allowed 0 as an irreducible, Theorem 5.10 would
be false for the case m = 0. As you guessed, K[t]/<0> is isomorphic
to K[t] -- not a field.
(Actually, the ideal <0> in K[t] does have one important property
in common with the ideals for m(t) an irreducible polynomial.
Namely, both are ideals I such that xy\in I => x\in I or y\in I.
Such an ideal is called a "prime ideal"; an ideal I of a ring R is
prime if and only if R/I is an integral domain. 0 and the
irreducible polynomials are the only elements of K[t] which generate
prime ideals; as such they are called "prime elements". But we won't
be using these ideas in this course.)
----------------------------------------------------------------------
You ask whether we are ever interested in the case of
Theorem 5.10 (p.61) where m(t) is reducible.
For this course, no.
In other contexts yes; for instance, if m(t) is the minimal
polynomial of an n by n matrix A, then the ring generated by A
over the field K I_n of scalar matrices is isomorphic to K[t]/<m>.
----------------------------------------------------------------------
You ask about how to compute inverses in fields K[t]/<m>, shown
to exist in Theorem 5.10 (p.61).
The computation is essentially the same as that of Exercise 3.3
in the case where the h.c.f. of that exercise is 1, with m(t)
in the role of g(t). In K[t] you get 1 = a(t) f(t) + b(t) m(t),
so in K[t]/<m> this becomes 1 = [a(t)] [f(t)], since the term
[m(t)] is 0. Thus, [a(t)] is an inverse of [f(t)].
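Here is a rough Python sketch of that computation over K = Q, using the
Euclidean algorithm while keeping track of a(t). Polynomials are lists
of coefficients, constant term first; all the function names are my
own, and this is just one way of organizing the calculation.

```python
from fractions import Fraction

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def sub(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([x - y for x, y in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(p, d):
    """Quotient and remainder on dividing p by the nonzero polynomial d."""
    quot = [Fraction(0)] * max(len(p) - len(d) + 1, 1)
    rem = trim(p)
    while len(rem) >= len(d):
        shift = len(rem) - len(d)
        c = rem[-1] / d[-1]
        quot[shift] = c
        rem = sub(rem, mul([Fraction(0)] * shift + [c], d))
    return trim(quot), rem

def inverse_mod(f, m):
    """A polynomial a with a*f = 1 in Q[t]/<m>, if f and m are coprime."""
    f = [Fraction(c) for c in f]
    m = [Fraction(c) for c in m]
    # Invariant: r0 = a0*f and r1 = a1*f modulo m.
    r0, a0 = trim(m), []
    r1, a1 = divmod_poly(f, m)[1], [Fraction(1)]
    while r1:
        quot, r2 = divmod_poly(r0, r1)
        r0, a0, r1, a1 = r1, a1, r2, sub(a0, mul(quot, a1))
    if len(r0) != 1:
        raise ValueError("f and m have a common factor")
    return divmod_poly([c / r0[0] for c in a0], m)[1]

# Example: m = t^2 - 2, f = 1 + t; the inverse should be t - 1.
inv = inverse_mod([1, 1], [-2, 0, 1])
print(inv)
```

In the example, K[t]/<t^2 - 2> is a copy of Q(sqrt 2), and the answer
says 1/(1 + sqrt 2) = sqrt 2 - 1.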
You also ask about the structure of K[t]/<m> when m(t) is
reducible.
There's too much to summarize in a brief e-mail, and the course
material doesn't leave me time to go off on tangents like that in
class, much as I would enjoy it. The answer depends on whether
m(t) has distinct factors or repetitions of one factor, or both.
The differences are like those between Z_n when n is a power
of a prime, a product of distinct primes, or a product of powers
of primes. If you're interested, ask in office hours and I can lead
you through some of the ideas.
----------------------------------------------------------------------
You ask (in connection with the vector space structure of an extension
field as described on p.67) what you can assume about linear
independence of elements of extension fields.
Only what you can prove, or what Stewart has proved for you!
Note that Lemma 5.14 (p.63) gives a powerful tool. In connection
with the example of the square root of 2 and the cube root of 2
(over Q) that you mention, that can be gotten by applying that
Lemma with alpha a 6th root of 2. (Do you see how?)
----------------------------------------------------------------------
You ask how much you have to know about cardinals, which Stewart
mentions on p.68.
You don't. As Stewart says there, if you are not familiar with
cardinals, just understand any non-finite-dimensional extension
as having degree "infinity", and interpret the tower law when
one or more of the degrees is infinity by the simple formulas he
gives at the bottom of the page. In the cases we are most interested
in the degrees will be finite.
If you want to learn about cardinals in the future, you might take
Math 135. (Math 104 and Math 55 usually give a tiny bit about the
subject -- the distinction between "countable" and "uncountable" -- but
there's a lot more to it.)
----------------------------------------------------------------------
You ask, in connection with Example 6.8 on p.71, whether
[\Q(a_1, ...,a_n): \Q(a_1,...,a_{n-1})] is either 0 or 2 depending on
whether a_n belongs to \Q(a_1,...,a_{n-1}) or not.
I assume that by "0 or 2" you mean "1 or 2".
The answer is -- yes if a_n is a root of a quadratic polynomial
over \Q(a_1,...,a_{n-1}) (and so in particular, if it is a root of a
quadratic polynomial over \Q). But if it's a root of a degree d
polynomial over \Q(a_1,...,a_{n-1}), all you can say is that it has
degree _< d -- namely, the degree will be the degree of its minimal
polynomial over \Q(a_1,...,a_{n-1}), which will be a divisor of
whatever polynomial you know it satisfies.
----------------------------------------------------------------------
You ask about Stewart's statement on p.71 that the degree of
Q(sqrt{6},sqrt{10},sqrt{15}) is 4 and not 8.
If you multiply sqrt{6} and sqrt{10}, and simplify as in
High School algebra, you'll get an expression in terms of which
you can express sqrt{15}. So Q(sqrt{6},sqrt{10}) contains
sqrt{15}, so Q(sqrt{6},sqrt{10}) = Q(sqrt{6},sqrt{10},sqrt{15}).
There's no general test for roots of arbitrary polynomials; but you
can play around with this case and figure out what properties of
this extension by square roots makes it work, and hence find a general
result that will include it.
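As a quick sanity check of the "High School algebra" step, here is a
minimal sketch in Python. It is only an illustration: the variable names
are mine, and the floating-point comparison is a plausibility check, not
a proof. The integer identity 6 * 10 = 2^2 * 15 is the exact fact behind it.

```python
import math

# sqrt(6) * sqrt(10) = sqrt(60) = 2 * sqrt(15),
# so sqrt(15) = sqrt(6) * sqrt(10) / 2 lies in Q(sqrt 6, sqrt 10).
product = math.sqrt(6) * math.sqrt(10)
sqrt15_from_others = product / 2

# Exact integer form of the same identity.
exact_identity_holds = (6 * 10 == 2**2 * 15)

# Floating-point confirmation (approximate by nature).
close = math.isclose(sqrt15_from_others, math.sqrt(15))
```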
----------------------------------------------------------------------
You ask whether it would be simpler to find the degree
[Q(sqrt 2, sqrt 3, sqrt 5): Q] by the Tower Law rather than
the method of Example 6.8, pp.71-72.
But in that example, Stewart _is_ finding the degree using the
Tower Law! The Tower Law expresses that degree as the product
of the degrees of three intermediate extensions, and each of
those degrees is easily seen to be either 1 or 2. What he then
spends most of those two pages doing is showing that those degrees
are 2, not 1. For this one needs to know that the extension field
is strictly bigger than the base field; i.e., that the new element
one adjoins at each stage is not in the field one already has; and
that fact is what all the calculations he gives are aimed at getting.
----------------------------------------------------------------------
You ask on what basis Stewart implicitly assumes on p.79 that
[K_{j-1}(x_j,y_j) : K_{j-1}(x_j)] = 1 or 2.
The fact that y_j satisfies a quadratic equation over K_{j-1}.
Stewart has written down explicitly a quadratic equation satisfied
by x_j at the top of the page. He notes there, "The same holds for
the y-coordinates."
----------------------------------------------------------------------
You ask about the meaning of the phrase "duplicating the cube"
on p.80, Theorem 7.5.
Stewart gives you the meaning of this and the other classical
problems on p.75, end of next-to-last paragraph, "These ask,
respectively, for ...".
----------------------------------------------------------------------
You ask why the field K_0 is taken to be Q in Theorem 7.5, p.80.
Well, if we are going to duplicate the cube, we only need to be given
the length of one side of the given cube, and from it compute the length
of one side of the doubled cube. Given a segment representing one side
of the given cube, the easiest way to choose coordinates is to take one
end of the segment to be the origin and the other end to be (1,0);
and if we do so, we find that K_0 = Q(0,1) = Q. (In contrast, when
we want to trisect an angle, we must have more than just a line-segment
to represent the original angle.)
----------------------------------------------------------------------
You ask whether, on p.92, top paragraph, alpha_2 ^2 when applied to
a complex number x+iy doesn't give (x-iy)^2, which is not the
same as x+iy.
No, alpha_2 ^2 (x+iy) does not mean the result of taking
alpha_2 (x+iy) and squaring it; it means the result of applying the
square of alpha_2 to x+iy. Here "the square of alpha_2" means the
result of composing the function alpha_2 with itself. Since
alpha_2 of a complex number gives the conjugate of that number,
alpha_2 ^2 of a complex number is the conjugate of the conjugate,
i.e., the original number.
It's true that in some contexts, putting an exponent on a function
symbol means taking the operation that applies the function and then
forms a power of the result; e.g., sin^2 x means (sin x)^2. But
when one is talking about a set of operations under the operation of
composition, exponents are always understood to refer to composition of
the operations with themselves the indicated number of times.
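To make the two readings concrete, here is a small Python sketch
(my own illustration; the helper name "compose" is hypothetical).
It contrasts alpha_2 composed with itself against the square of the
value alpha_2(z).

```python
def alpha_2(z: complex) -> complex:
    """Complex conjugation, playing the role of alpha_2."""
    return z.conjugate()

def compose(f, g):
    """Return the function x -> f(g(x))."""
    return lambda x: f(g(x))

# alpha_2 ^2 read as an operator: alpha_2 composed with itself.
alpha_2_squared = compose(alpha_2, alpha_2)

z = 3 + 4j
composed = alpha_2_squared(z)     # conjugate of the conjugate
squared_value = alpha_2(z) ** 2   # the other reading, (x - iy)^2
```

The first reading returns z itself; the second generally does not.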
----------------------------------------------------------------------
You ask whether, in finding the permutations of the roots of a
polynomial that determine the Galois group as on pp.91-92, we "only
have to make sure that each alpha in Galois(L:K) sends a_i to a root of
the minimal polynomial of a_i over K but, of course, only when that
root is in L?".
No. As I showed in class, the only permutations of the roots of the
polynomial t^3 - 2, namely 2^{1/3}, omega 2^{1/3}, omega^2 2^{1/3},
that determine elements of the Galois group of the extension
Q(omega, 2^{1/3}) : Q(omega), i.e., that respect algebraic relations,
are the cyclic permutations, even though a transposition that
interchanges two roots and leaves the third one fixed does preserve
minimal polynomials.
On the other hand, for Q(omega, 2^{1/3}) : Q, such a transposition
does induce a member of the Galois group.
----------------------------------------------------------------------
You ask, in connection with the definitions of p.93, whether
M^star dagger = M^star dagger star dagger must always hold.
Yes! In fact, even more strongly, M^star = M^star dagger star
always holds. (So the equation you ask about can be obtained by
applying "dagger" to both sides of this one.) And one doesn't need
to know any field theory to prove this -- just the fact that we have
two sets called L and Gamma, and some concept of an element of
Gamma "fixing" an element of L (we don't have to know what this
means), and that M^star is defined as the set of elements of Gamma
that "fix" all elements of M, and H^dagger is defined as the set
of elements of L "fixed" by all elements of H. Starting with that,
the rest is just simple logical reasoning. (One doesn't even have to
assume M a subfield of L; just a subset.)
Exactly the same reasoning shows that H^dagger = H^dagger star dagger.
There are many other cases of this pattern in mathematics. Based on
the famous case we are studying in this course, the pattern is called
a "Galois connection". I discuss it in my Math 245 course notes,
http://math.berkeley.edu/~gbergman/245
namely, in Chapter 5, starting on p.141, near the bottom. (I use "*"
there for both the operators that Stewart calls "star" and "dagger".)
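Since the argument uses nothing but sets and a relation "fixes", one can
test it on made-up data. The following Python sketch (all names are mine,
and the relation is chosen at random purely for illustration) checks
M^star = M^star dagger star for every subset M of a small set L.

```python
import itertools
import random

def star(M, Gamma, fixes):
    """Elements of Gamma that fix every element of M."""
    return frozenset(g for g in Gamma if all(fixes(g, x) for x in M))

def dagger(H, L, fixes):
    """Elements of L fixed by every element of H."""
    return frozenset(x for x in L if all(fixes(g, x) for g in H))

def check_all_subsets(L, Gamma, fixes):
    """Check M* == M*+* (star, dagger, star) for every subset M of L."""
    for r in range(len(L) + 1):
        for M in itertools.combinations(L, r):
            s = star(M, Gamma, fixes)
            if star(dagger(s, L, fixes), Gamma, fixes) != s:
                return False
    return True

random.seed(0)
L = range(5)
Gamma = range(4)
# An arbitrary relation standing in for "g fixes x"; no field theory here.
relation = {(g, x) for g in Gamma for x in L if random.random() < 0.5}
ok = check_all_subsets(L, Gamma, lambda g, x: (g, x) in relation)
```

A check like this is no substitute for the proof, but it does show that
the identity does not depend on what "fixing" means.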
----------------------------------------------------------------------
You ask about cases where H^{dagger star} is strictly larger than H
(p.93).
In the situation considered in Galois Theory -- where L is a finite
algebraic extension of K -- that cannot happen; we just aren't
ready to prove this yet.
For infinite extensions, algebraic or transcendental, it can. For
instance, let K = Q and L = Q(sqrt 2, sqrt 3, sqrt 5, sqrt 7, ...),
the extension gotten by adjoining the square roots of all primes.
Let H be the set of those automorphisms which act by reversing the
signs of the square roots of finitely many primes only, leaving the
other signs unchanged. (That is, for every finite set of primes, the
automorphism that changes the signs of the square roots of the primes
in that set belongs to H, and those are all the elements of H.)
It is not hard to show that H is a group, and that H^dagger = Q,
essentially because every element of L involves square roots of
only finitely many primes. But H^{dagger star} = Q^star, and this
consists of automorphisms that change the signs of the square roots of
arbitrary subsets of the primes; so it is properly larger than H.
----------------------------------------------------------------------
You ask what [L:K] is for L and K as on p.95.
We'll have the tools to answer that when we get to reading #22!
----------------------------------------------------------------------
You ask what map is described in the second line of Lemma 8.11,
p.97, as having kernel A_n.
The map from S_n to its cyclic quotient group of order p = 2.
Another way of saying this is that if N is a normal subgroup such
that S_n / N is cyclic of prime order p, then p = 2 and N = A_n.
----------------------------------------------------------------------
You ask about the notation (a b) on p.97.
An expression (a_1 ... a_k)\in S_n means the cyclic permutation of
{1,...,n} which sends a_1 to a_2, a_2 to a_3, etc., and finally
a_k to a_1, and which leaves all other elements of {1,...,n} fixed.
Here a_1,...,a_k are distinct elements of {1,...,n}.
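If it helps, here is the same definition as a short Python sketch. The
function name "cycle" and the dictionary representation are my own
choices for illustration.

```python
def cycle(n, *a):
    """Permutation of {1,...,n} given by the cycle (a_1 ... a_k), as a dict."""
    if len(set(a)) != len(a) or not all(1 <= x <= n for x in a):
        raise ValueError("entries must be distinct elements of {1,...,n}")
    perm = {i: i for i in range(1, n + 1)}  # elements outside the cycle are fixed
    for i, x in enumerate(a):
        perm[x] = a[(i + 1) % len(a)]       # a_i -> a_{i+1}, and the last -> a_1
    return perm

p = cycle(5, 1, 3, 4)   # the cycle (1 3 4) in S_5
```

In particular, cycle(n, a, b) is the transposition (a b).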
----------------------------------------------------------------------
You ask why, in the top paragraph of p.99, A_n fixes h_e and h_o.
The general principle is that if a group G acts on a set X, and
if N is a normal subgroup of G, then the set of elements of X
that are not moved by elements of N is closed under the action
of G. For let g be any element of G, and y any element of X
that is not moved by N. Then if we apply any n\in N to gy,
we get n g y = g (g^-1 n g) y. Since N is normal, g^-1 n g belongs
to N, so by assumption it does not move y, so n g y = g y, showing
that g y is also not moved by N.
Now A_n is a normal subgroup of S_n, so the above principle shows
that if h is not moved by elements of A_n, then sigma(h) (which
Stewart for no good reason is calling h^sigma on this page) will
also not be moved by A_n, so the same is true of 1/2 (h + h^sigma)
and 1/2 (h - h^sigma).
----------------------------------------------------------------------
You ask why, on p.99, line after second display, he says that A_n and
sigma generate S_n.
The index [S_n : A_n] is 2, so there are no intermediate groups, so
any subgroup of S_n that properly contains A_n must be S_n itself.
Since sigma is not in A_n, the subgroup that it and A_n generate
properly contains A_n, so it is S_n.
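For a small case one can simply compute the generated subgroup. Here is
a Python sketch for n = 4 (my own illustration; permutations are tuples
on {0,...,n-1}, and sigma is taken to be one particular transposition).

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p(q(i)) for permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(p)))

def is_even(p):
    """Parity of a permutation, by counting inversions."""
    inv = sum(1 for i in range(len(p))
              for j in range(i + 1, len(p)) if p[i] > p[j])
    return inv % 2 == 0

def generated_subgroup(gens):
    """Closure of gens under composition (this suffices in a finite group)."""
    elems = set(gens)
    frontier = list(gens)
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems

n = 4
S_n = set(permutations(range(n)))
A_n = {p for p in S_n if is_even(p)}
sigma = (1, 0, 2, 3)                 # a transposition, hence not in A_n
H = generated_subgroup(list(A_n) + [sigma])
```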
----------------------------------------------------------------------
You ask why on p.99, 3rd display, K(delta) is contained in K_2.
The author has shown that alpha_1 is in K(delta), so we have
K \subset K(alpha_1) \subset K(delta).
But [K(delta):K] = 2, a prime, so by the Tower Law, K(alpha_1)
must equal either K or K(delta). It was assumed that alpha_1
is not in K, so K(alpha_1) = K(delta). But
K(alpha_1) = K_1 \subset K_2,
giving the inclusion you ask about.
----------------------------------------------------------------------
You ask about the term "radical extension" used in the statement of
Theorem 8.14, p.100.
The errata on the homework I gave out on Friday tells you that before
reading this theorem, you should read the definition of "radical
extension" on p.153.
----------------------------------------------------------------------
You ask where the "c" came from in the 4th display on p.101.
P_j and P_k (I hope you made the correction shown on the homework
sheet, from Stewart's "P and P_j") are being assumed non-coprime; but
they are both monic, so each must divide the other. Since they have
the same degree, the only way this can happen is if one is a constant
times the other. "c" is that constant.
----------------------------------------------------------------------
You ask what Stewart means by "symmetry under S_n" in the proof
of Lemma 8.18 on p.102.
This is the concept that he discussed at the beginning of the
section. S_n acts on polynomials and rational functions by
permuting the subscripts; if one of these elements is invariant under
all these permutations, it is called a "symmetric" polynomial or
rational function. "Symmetry" means the property of being symmetric.
----------------------------------------------------------------------
You ask about replacing the definition of Sigma being the splitting
field of f over K on p.108 by a single condition, saying that
Sigma is generated over K by all the zeroes of f.
This would work for subfields of the complex numbers, where "all the
zeroes" means "all the zeroes in C". But although Stewart insists
that the only fields we are considering now are subfields of C, he is
looking toward the part of the book where he will allow more general
fields; and in that situation, there is not one big field where
everything lies and every polynomial splits, but different ways of
constructing extension fields where various polynomials split; so it
won't make sense to talk about a single set comprising "all the zeroes".
It is to give a definition that will continue to work in that situation
that Stewart uses the form he does.
----------------------------------------------------------------------
You ask whether the statement at the end of Example 9.2, p.108
doesn't contradict Definition 9.1.
A fact can't contradict a definition. A definition says how a word
is used, in this case, the word "splitting field". The only thing
that could contradict a definition would be if the author used the word
"splitting field" in a situation where the conditions stated didn't
apply.
----------------------------------------------------------------------
You ask whether condition 2 in the definition of splitting field
on p.108 is important.
Yes, definitely! For instance, Lemma 9.5 would not be true without
it. Neither would the fact that I sketched at the end of class
yesterday, that the splitting field of a polynomial f has enough
automorphisms to carry every root of f to every other. For
instance, Q(2^{1/2}) is the splitting field of the polynomial t^2 - 2
over Q, and it has an automorphism interchanging the roots of that
polynomial. On the other hand, Q(2^{1/4}) is not a splitting field,
and even though it contains both roots of t^2 - 2 (namely, the square
of 2^{1/4} and the negative of that element) it has no automorphism
interchanging them: one of them (the positive one) has a square root
in the field while the other doesn't, making such an automorphism
impossible.
----------------------------------------------------------------------
You ask whether Stewart's use of sigma_i for the zeroes of f on the
first line on p.109 is because we will be considering permutations of
these elements.
No; he simply considers any lower-case Greek letter fair game as
a symbol for an element of an extension field. It would have been
better to have used a letter near the beginning of the Greek alphabet.
----------------------------------------------------------------------
Regarding Definition 9.10 on p.112, you ask
> What is a simple zero? I am pretty sure that it is just a zero with
> multiplicity 1, but I couldn't find a definition for it in Stewart
> (his index is all but worthless).
His index is as good as that of most math texts!
It's true that it doesn't have a listing for "simple", but since you
suspected that it means "multiplicity 1", you should have then looked
for "multiplicity" in the index. It sends you to p.44, where the
discussion of the subject begins; the formal definition is on p.45,
and you can see there that "simple zero" is defined as you suspected.
----------------------------------------------------------------------
You ask, regarding the proof of Lemma 9.13 on p.113, whether when
Stewart uses the term "irreducible factor of g", this should be
assumed to be of degree greater or equal to 1.
Right. As shown in my correction to the definition of "irreducible",
the words "of positive degree" need to be added to that definition
to fit the way Stewart actually uses the term.
----------------------------------------------------------------------
You ask why in the proof of Lemma 9.13, p.113, Stewart can say
"by induction g and Dg are coprime".
Well, you have to look back in the proof and see whether he sets up
an induction. And in fact, he says at the beginning of the paragraph
that he will prove the result by induction on curly-d f, the degree
of f. So you should check: will g necessarily have lower degree
than f ? Does the hypothesis of the statement that is being proved
by induction hold for g ? If so, then he can assume inductively that
the conclusion of that statement also holds for g .
(If, in looking back, you found that he did not explicitly set up
any induction, then you would have to ask yourself "Is there some
induction he might be considering to be straightforward, so that he
can use it without formally setting it up?" But in this case he did
formally set it up, so that isn't a problem.)
----------------------------------------------------------------------
You ask how, when we apply Lemma 9.13 to proving Prop. 9.14 on p.113,
we know that f and Df will have a common factor over K, and
not just over Sigma.
In the sheet of corrections that I gave out at the beginning of the
course, it says to correct "Sigma[t]" to "K[t]" in the last line
of the statement of Lemma 9.13. If you haven't made those corrections
yet, please do so -- understanding the material depends on it! (And
reading the proof of the lemma, you'll see that Stewart really does
prove that there is a common factor in K[t].)
----------------------------------------------------------------------
You ask what contradiction Stewart gets at the end of the proof
of Proposition 9.14, p.114.
Though he doesn't state it explicitly, the conclusion that f is a
constant contradicts its being irreducible. (The correction that I
made to the definition of irreducible really does correspond to the
way he uses it -- though he says after the definition that a constant
polynomial will be considered irreducible, throughout the rest of the
book he requires irreducibles to have positive degree.) It also
contradicts the assumption that f has a common factor of positive
degree with Df -- a nonzero constant polynomial can't have a factor
of positive degree.
----------------------------------------------------------------------
You ask whether we will need to consider infinite Galois groups or
extensions, since the counting principles used depend on finiteness
(p.117).
Not in this course! The bijectivity of the Galois correspondence
we will prove does not hold in the infinite case; but one can obtain a
bijective Galois correspondence by putting a topological structure on
the Galois group, and showing that intermediate fields correspond to
topologically closed subgroups of that group. This is done with the
help of the fact that every normal extension is a union of its finite
normal subextensions. I develop this result whenever I teach Math 250B;
cf. http://math.berkeley.edu/~gbergman/grad.hndts/infGal+profin.ps .
----------------------------------------------------------------------
You ask what is meant in the first sentence after display (10.3) on
p.118, which says that the relations are linearly independent "unless
lambda_1(y) = lambda_2(y) = lambda_3(y), and we can choose y to
prevent this."
Equation (10.3) depends on our arbitrary choice of y; one might say
that it is not one equation, but a system of equations, one for each
y. Some of these equations may be linearly dependent on (10.1), but
since lambda_1 and lambda_2 are distinct monomorphisms, there exists
an element y such that lambda_1(y) \not-= lambda_2(y). Choosing
such a y gives us an equation (10.3) that is not linearly dependent
on (10.1).
----------------------------------------------------------------------
You ask how Stewart gets the final "= 0" in the first display
on p.121.
From equation (10.6), substituted into the bracket on the preceding
line. (But if I have time I will show in class tomorrow a more
transparent way of getting the result of this calculation.)
----------------------------------------------------------------------
You ask how the first display on p.121 shows that g_1, ..., g_n are
linearly dependent.
Compare the first and last steps -- it shows that
y_1 g_1 + ... + y_n g_n = 0.
----------------------------------------------------------------------
You ask how Stewart can conclude that each of the coefficients in
the third from last display on p.121 is 0, as expressed by the display
after that.
He explains this in the line in between: The preceding display is
a system of linear relations like (10.8), but (10.8) was assumed to
have the smallest number of nonzero coefficients among all nontrivial
systems of relations of that form. But the new display has fewer
nonzero coefficients, so it cannot be a nontrivial relation; it can
only be the relation with all coefficients zero. This is exactly like
the proof of Lemma 10.1, where equation (10.4) was assumed to have the
minimal number of terms, so that in the third display on p.119, all the
lambda_i(x) have to have coefficient zero.
----------------------------------------------------------------------
You ask where the g_j's went in the first display on p.122.
Notice that he says on the preceding line "with j=1", and see the end
of the very first sentence of the proof, on p.120.
----------------------------------------------------------------------
You ask whether there is a more straightforward way
to do the proof of Theorem 10.5 (pp.120-122).
Not a fundamentally simpler way -- it really seems to be a "magical"
proof. But there are lots of little details that can be done more
nicely than Stewart does. I hope to show some tomorrow. As one
example, since linear relations like (10.8) can be multiplied by
any element of L and still remain valid, one can assume without
loss of generality that y_1 = 1. Then the messy business of
multiplying that equation and the one that follows by the first
coefficient of the other can be avoided; they will both have first
coefficient 1 and one can simply subtract one from the other.
----------------------------------------------------------------------
You ask how to see that there are only 4 candidates for Q-automorphisms
in Example 10.7(2), p.122.
He has just noted that there are only 4 values to which a Q-automorphism
might send zeta. Since zeta generates the field over Q, once we
specify where zeta is sent, this determines where every element is
sent. E.g., if zeta is sent to zeta^2 as in "alpha_2", then
s zeta^3 must be sent to s (zeta^2)^3 = s zeta.
----------------------------------------------------------------------
You ask why the linear relations among 1, $\zeta$, $\zeta^2$, $\zeta^3$,
$\zeta^4$ are generated by $\zeta + \zeta^2 + \zeta^3 + \zeta^4 = -1$
(p.123).
A linear relation among these elements corresponds to a linear
combination of 1, t, t^2, t^3, t^4 that gives 0 when zeta is
substituted for t. Such a polynomial must be a multiple of the
minimal polynomial of zeta, and that polynomial has degree 4, so
the only multiples of it in Q[t] that are linear combinations
of 1, t, t^2, t^3, t^4 are multiples by constants, i.e., constant
multiples of the one equation Stewart gives.
----------------------------------------------------------------------
You ask about Stewart's statement on p.123 that it is easy to find
the fixed field of the group he has just described.
The easiest way is to use {zeta, zeta^2, zeta^3, zeta^4} as basis
for the extension, so that a general element can be written uniquely
as x = q zeta + r zeta^2 + s zeta^3 + t zeta^4. Then the automorphisms
listed on p.122 will all fix x if and only if q = r = s = t. So any
element of the fixed field has the form q zeta + q zeta^2 + q zeta^3 +
q zeta^4 = q (zeta + zeta^2 + zeta^3 + zeta^4) = q (-1) \in Q.
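The condition q = r = s = t can be checked mechanically. In the Python
sketch below (my own illustration, with hypothetical names), an element
is stored as its coefficients on zeta, ..., zeta^4, and the automorphism
sending zeta to zeta^k just permutes those coefficients. Comparing
coefficients is legitimate because these four powers form a basis.

```python
from fractions import Fraction
from itertools import product

def apply_alpha(k, coeffs):
    """Apply zeta -> zeta^k to x = sum of coeffs[j] * zeta^j, j = 1..4.

    Since zeta^5 = 1, the term c * zeta^j goes to c * zeta^(j*k mod 5).
    """
    return {(j * k) % 5: c for j, c in coeffs.items()}

def fixed_by_all(coeffs):
    """True if every one of the four automorphisms fixes the element."""
    return all(apply_alpha(k, coeffs) == coeffs for k in (1, 2, 3, 4))

# Try every coefficient vector with entries from a small sample of rationals.
sample = [Fraction(0), Fraction(1), Fraction(-1, 2)]
fixed = [c for c in product(sample, repeat=4)
         if fixed_by_all(dict(zip((1, 2, 3, 4), c)))]
```

Every vector that survives has all four coefficients equal.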
----------------------------------------------------------------------
You ask how Theorem 11.3 (p.126) can be used to construct explicit
automorphisms.
To see that, you have to go back to the proof of Theorem 9.6, which
it calls on. And to see how that works, you need to go back to the
proofs of the results it calls on in turn.
----------------------------------------------------------------------
You ask in connection with Proposition 11.4 on p.126 whether a
K-automorphism sigma of a field extension L of a field K can send
a zero alpha of an irreducible polynomial f\in K[t] to an element
of L that is not a zero of f.
Definitely not! Since sigma is a K-automorphism, it respects the
field operations of L and fixes elements of K. Since f(alpha) is
computed using the field operations of L and the coefficients of f,
which are elements of K, we have
sigma(f(alpha)) = f(sigma(alpha)).
The left-hand side is sigma(0) = 0. Equating the right-hand side to
0, we see that sigma(alpha) is also a zero of f.
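Here is a concrete instance as a Python sketch. The choices are mine and
only for illustration: K = R, L = C, sigma = complex conjugation, and
f(t) = t^2 - 2t + 5, which is irreducible over R.

```python
def f(t):
    """f(t) = t^2 - 2t + 5, a polynomial with real coefficients."""
    return t * t - 2 * t + 5

def sigma(z):
    """Complex conjugation: an automorphism of C fixing every real number."""
    return z.conjugate()

alpha = 1 + 2j   # a zero of f
beta = 2 + 3j    # not a zero of f

zero_at_alpha = (f(alpha) == 0)
zero_at_sigma_alpha = (f(sigma(alpha)) == 0)
# sigma(f(z)) = f(sigma(z)) holds for every z, zero of f or not.
commutes = (sigma(f(beta)) == f(sigma(beta)))
```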
----------------------------------------------------------------------
You ask why, in Chapter 11 we consider K-monomorphisms and not just
K-automorphisms.
I hope I made this clear in class: We need to use K-monomorphisms in
"building up" the K-automorphisms that are our ultimate interest. Thus,
Corollary 11.11, about K-automorphisms, couldn't be proved without the
inductive construction of K-monomorphisms in Theorem 11.10 (p.128).
As to why Stewart gives Theorem 11.13, about K-monomorphisms, after he
has already gotten Corollary 11.11, I don't really know. Perhaps he
will use it in a later chapter.
----------------------------------------------------------------------
You ask how Proposition 11.4 is used on p.129, line 4.
For each alpha_i, that Proposition gives an automorphism taking
alpha to alpha_i. So altogether it gives the set of automorphisms
described.
----------------------------------------------------------------------
You ask how we know that the phi_ij, defined in the first display
on p.129, are distinct.
Good question!
Consider two such maps phi_ij and phi_i'j', with (i,j) not-= (i',j').
If i not-= i', then by construction tau_i and tau_i' have
different effects on alpha. But rho_j and rho_j' are
K(alpha)-automorphisms, so they both fix alpha. Hence the composite
maps tau_i rho_j and tau_i' rho_j' act on alpha in the different
ways that tau_i and tau_i' do, so they are unequal.
This leaves the case i = i'. Since we are assuming (i,j) not-=
(i',j'), we must have j not-= j'. Hence by inductive assumption,
rho_j not-= rho_j'. When these are composed with the same automorphism
tau_i = tau_i', the composites must also be distinct.
----------------------------------------------------------------------
You ask, in connection with the true/false question 11.7(d)
on p.131, for an example of an extension with Galois Group
of order 1 that is not normal.
The extension gotten by adjoining a cube root of 2 to Q. (This
was the first example we saw of a non-normal extension.)
----------------------------------------------------------------------
You ask how, on p.134, second line, we are to deduce normality of L:M
from Theorem 9.9.
By applying that theorem twice: The theorem first tells us that L
is the splitting field over K of some polynomial f over K, since it
is assumed normal over K; we see from this that L is also the
splitting field of f over M; hence by a second application of
the theorem, it is normal over M.
----------------------------------------------------------------------
You ask what the 2-headed arrow on the upper right in Figure 13.1,
p.138 indicates.
It is marked "tau", and it means that the automorphism tau reflects
the picture over the diagonal line. Likewise, the four bent arrows
represent sigma, and show that sigma rotates the square of roots
90 degrees counterclockwise.
----------------------------------------------------------------------
In connection with Stewart's statement at the end of p.140 that
C^dagger, D^dagger and E^dagger are not normal, you ask for
an irreducible polynomial with a root in the last of these
fields that does not split in that field.
Since C^dagger and E^dagger are complex conjugates of one
another, and I discussed the former a bit in class, I'll answer
for it instead. I noted that the generator (1+i) xi of
C^dagger, when squared, gave 2i xi^2. From this one can deduce
that its 4th power is -8; i.e., it is a zero of t^4 + 8.
Now I claim that t^4 + 8 is irreducible, and that C^dagger
does not contain (1-i) xi, which is also a zero of t^4 + 8.
There are various ways to get these two facts. E.g., if
t^4 + 8 were reducible, then (1+i) xi would have degree < 4,
contradicting the fact that we know the field it generates has
degree 4; and (1-i) xi does not belong to that field by
the condition a_1 = a_5 on that page.
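The 4th-power computation can be checked exactly. This Python sketch is
my own illustration; it assumes, as in Stewart's example, that xi is the
real 4th root of 2, so that xi^4 = 2.

```python
one_plus_i = complex(1, 1)
one_minus_i = complex(1, -1)

def fourth_power(z):
    """z^4 by two squarings (exact here, since all parts are small integers)."""
    sq = z * z
    return sq * sq

# Assumption: xi = 2^(1/4), so xi^4 = 2 exactly, and
# ((1 +/- i) xi)^4 = (1 +/- i)^4 * xi^4.
xi_fourth = 2
value_plus = fourth_power(one_plus_i) * xi_fourth
value_minus = fourth_power(one_minus_i) * xi_fourth
```

Both values come out as -8, so both elements are zeros of t^4 + 8.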
----------------------------------------------------------------------
You ask about extending the definition of solvable group (p.143) so
as to allow an infinite chain of proper subgroups (presumably, having
intersection the trivial subgroup).
Such a concept can be defined, but it has much weaker properties than
that of solvability. In particular, a homomorphic image G/N of a
group G with that property need not have it. In fact, any group
whatsoever can be written as a homomorphic image of what is called
a free group, though every free group has an infinite chain of normal
subgroups with abelian factor groups, and with intersection the trivial
subgroup. The reason this is possible is that distinct subgroups of a
group, even a family of distinct subgroups which intersect trivially,
can have the same homomorphic images in G/N. (For an example of this
phenomenon, consider the subgroups Z > 2Z > 4Z > 8Z > ... of the
integers, and what happens when we map Z to Z/3Z. Even though the
original subgroups have trivial intersection, their images in Z/3Z do
not.)
Anyway, to say a group G has a series of subgroups of the sort you
describe can be shown equivalent to the statement that for every
nonidentity element g of G, there is a solvable homomorphic image
G/N in which g has nonidentity image. A group with that property
is called "residually solvable".
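The Z/3Z example above is easy to check by machine. A minimal Python
sketch (the function name is hypothetical): for each k, it collects the
images in Z/3Z of a few elements of 2^k Z.

```python
def image_mod_3(k, samples=3):
    """Image of the subgroup 2^k Z under Z -> Z/3Z, from a few multiples."""
    return {(2**k * m) % 3 for m in range(samples)}

# The subgroups Z > 2Z > 4Z > ... for k = 0, 1, ..., 9.
images = [image_mod_3(k) for k in range(10)]
```

Each image is all of Z/3Z, because 2^k is invertible modulo 3.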
----------------------------------------------------------------------
In connection with the warning at the bottom of p.143, that given
a chain of three groups, each normal in the next, the first may
not be normal in the third, you ask whether, if the first and
second are normal in the third, the first will be normal in the
second.
More is true: As long as the first is normal in the third and
contained in the second, it will be normal in the second. This
follows immediately from the definition of normality.
----------------------------------------------------------------------
You ask about the first isomorphism symbol in the next-to-last display
on p.145.
This is an application of (what Stewart numbers as) the First
Isomorphism Theorem, with the order of the two sides reversed.
(With this information you should be able to see what groups have
the roles of the H and A of that theorem.)
----------------------------------------------------------------------
You ask in connection with the concept of simple groups (p.146)
whether there is a relation between these and simple extensions.
No. "Simple" is used in both definitions in the sense of its Latin root,
meaning "one-fold". But in the definition of "simple group" there is
a very strong concept of "one-fold-ness": the group represents a link
in a chain of normal subgroups where no further subdivision is possible.
In the definition of "simple extension" we have a different, and
rather weak sort of one-fold-ness: an extension that can be generated
by one element.
----------------------------------------------------------------------
You ask how, in the proof of Theorem 14.7, p.147, one concludes that
if a normal subgroup N of A_n contains one 3-cycle it contains all.
First, Stewart observes that "without loss of generality" the one
3-cycle that we know it contains can be assumed to be (123). This
is because the elements 1, 2, 3 of the set which our permutations
act on are no different from any others so far as the definition of
A_n is concerned; so if N <| A_n contains a different 3-cycle (pqr),
we could go through the same proof using p, q, r everywhere in
place of 1, 2, 3.
Then, assuming (123)\in N, Stewart proves that any 3-cycle (abk)
is in N. The key calculation is the second display on that page
(corrected as noted in the homework due right after the first
midterm) and the preceding sentence.
----------------------------------------------------------------------
You ask about the assumption at the top of p.148 that "without loss
of generality" N contains an element of the form (123)(456)y where
y fixes 1, 2, 3, 4, 5, 6.
Stewart is considering there the case where N contains an element
whose cycle-decomposition contains two 3-cycles; in other words, an
element of the form (abc)(def)y where y fixes a, b, c, d, e, f
and these six elements are distinct. Now in the set of elements which
S_n permutes, any set of six distinct elements is like any other --
there is nothing in our assumption that singles one out. So if we
can prove the result in the case where our a, b, c, d, e, f are
1, 2, 3, 4, 5, 6, then we can prove it for any a, b, c, d, e, f by
just writing a, b, c, d, e, f for 1, 2, 3, 4, 5, 6 respectively
in our proof.
----------------------------------------------------------------------
You ask about an algorithm for finding the conjugacy classes (defined
on p.149) in a group G.
If G is given by its multiplication table, this is easy: It is not
hard to check that two elements of a group are conjugate if and only
if they can be written as xy and yx respectively for some x and
y in the group. So to find all the conjugates of an element a, one
takes the multiplication table, looks for the occurrence of a in
each column, reflects the locations of these occurrences about the
diagonal of the multiplication table, and the elements in the
reflected positions are the conjugates of a. (Of course, if one has
a computer that can multiply elements of G, this is not much easier
than using the original definition, and getting the computer to list
the elements b a b^{-1} for all b\in G. Either way, it's O(n)
steps. I assume that it is clear that to get a list of the conjugacy
classes, one computes the class of one element, takes all members of
that class out of consideration, computes the class of the first
element remaining, and so on until one has eliminated all the
elements.)
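
As a rough illustration of the procedure just described (a sketch only,
with G taken to be S_3 represented by permutation tuples, and with all
function names invented for the example), one could write:

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Split a finite group into conjugacy classes, one class at a time."""
    remaining = list(group)
    classes = []
    while remaining:
        a = remaining[0]
        # The class of a is the set of all b a b^{-1} with b in G.
        cls = {compose(compose(b, a), inverse(b)) for b in group}
        classes.append(cls)
        # Take every member of that class out of consideration.
        remaining = [g for g in remaining if g not in cls]
    return classes

S3 = list(permutations(range(3)))
classes = conjugacy_classes(S3)
class_sizes = sorted(len(c) for c in classes)

def class_of(a):
    """Return the conjugacy class containing a."""
    return next(c for c in classes if a in c)

# Check the "xy and yx" criterion: xy and yx always lie in the same class.
xy_yx_conjugate = all(
    compose(y, x) in class_of(compose(x, y)) for x in S3 for y in S3
)
print(class_sizes, xy_yx_conjugate)
```

Each class costs one pass over the group, matching the O(n) count
above.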
But that is just one of the possible ways a group could be described.
As another example, if one takes the group of all invertible n x n matrices
over an algebraically closed field, the conjugacy classes correspond
to the distinct Jordan canonical forms (with no zeroes on the diagonal
to make these matrices belong to the group). That is really what
Jordan canonical form is about, and it obviously took some insight
into the structure of the group in question to find it -- not some
mechanical procedure! If one's base field is not algebraically closed,
there is something called the "rational canonical form", which is
messier; it is often described in Math 110 texts but not covered in the
course.
----------------------------------------------------------------------
I hope what I said in class clarified your final question about the
proof of Cauchy's Theorem on p.150. In the relation
$|C_j| = |G|/|C_G(x)|$
since $p$ divides the numerator of the fraction, but not the left-hand
side of the equation, it must divide the denominator of the fraction.
----------------------------------------------------------------------
You ask for an example where the analog of Cauchy's Theorem for
composite divisors is not true, other than A_5 noted in Exercise 14.6,
p.151.
Well, there are two ways one could try to generalize Cauchy's Theorem
to composite divisors. In his comment at the bottom of p.150, Stewart
tacitly translates Cauchy's Theorem as saying that G has a _subgroup_
of order p, making A_5 and n=15 an example of a group G and a
divisor n of |G| such that G has no subgroup of order n. A
simpler example of this is G = A_4, n = 6.
But if one takes the literal statement of Cauchy's Theorem, then
for its generalization to fail one merely wants a group G and a
divisor n of |G| such that G has no element of order n. For
this, any finite non-cyclic group will do, with n = |G|!
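
Both kinds of failure can be checked by brute force for G = A_4. The
following is only a sketch under the assumption that A_4 is modelled as
the even permutations of {0,1,2,3}; the helper names are made up for
the example.

```python
from itertools import permutations, combinations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even iff it has an even number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0

identity = tuple(range(4))
A4 = [p for p in permutations(range(4)) if is_even(p)]
others = [p for p in A4 if p != identity]

def is_closed(subset):
    """A finite subset containing 1 and closed under products is a subgroup."""
    return all(compose(x, y) in subset for x in subset for y in subset)

# Search all 6-element subsets containing the identity.
subgroups_of_order_6 = []
for choice in combinations(others, 5):
    candidate = frozenset(choice) | {identity}
    if is_closed(candidate):
        subgroups_of_order_6.append(candidate)

def order(p):
    """Order of the element p in the group."""
    k, q = 1, p
    while q != identity:
        q = compose(q, p)
        k += 1
    return k

element_orders = sorted({order(p) for p in A4})
print(len(A4), len(subgroups_of_order_6), element_orders)
```

So 6 divides |A_4| = 12 but no subgroup has order 6, and no element has
order 4, 6 or 12.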
----------------------------------------------------------------------
You ask about the difference between Definition 15.1 (p.153) and the
concept of solvability by Ruffini radicals (p.96). There's not very
much. I guess Stewart used the latter term if an extension that was
given turned out to be a radical extension in the sense of Definition
15.1, while he uses Definition 15.1 even when the radical extension
is not the extension we are interested in for itself, but a field
containing it. But that's not a very mathematical distinction. I'll
point this out to him. And on Monday or Wednesday I'll give an example
where the splitting field of a polynomial is contained in a radical
extension but is not itself radical. (I really already did give one:
any irreducible cubic with all three roots real. But I'll give a
concrete example where we can see the structure.)
----------------------------------------------------------------------
You ask about the definition of "radical degree" at the bottom of p.153.
You're right that this definition suggests that if n_j is a radical
degree for alpha_j, then any multiple of it will also be one, and
that from this point of view a more useful concept might be the least
n_j with the indicated property. But I think that the real idea here
is that we are specifying a sequence of elements and integers,
(alpha_1, n_1, ... , alpha_m, n_m) satisfying the condition of the
preceding display, and that we then manipulate such sequences, doing
such things as replacing a given pair alpha_j, n_j by a sequence
of pairs that make each integer in question a prime (as in Prop.8.9),
etc.  So the "radical degree of alpha_j" just means whatever integer
we have after alpha_j in the sequence we are working with at the
moment. Note, as an example, that if we are given two radical
extensions of K, say K(alpha_1, ... , alpha_m) and K(beta_1, ... ,
beta_n), and we want to show as I did in class that their composite
K(alpha_1, ... , alpha_m, beta_1, ... , beta_n) is a radical
extension, it is most convenient to use the exponents that we already
associated with the beta's, even though, after adjoining the alphas,
smaller exponents may work.
----------------------------------------------------------------------
You ask whether Lemma 15.5, p.155, remains true if p is replaced
by an integer n that is not a prime.
Yes, it does; but then it takes more work to show that the group of
nth roots of unity is cyclic, as needed by the proof. This is not so
hard when we are working with subfields of the complex numbers, since
e^{2 pi i / n} will be a generator; but harder in an abstract field,
such as Z/qZ for q an arbitrary prime. We'll prove this in
Theorem 20.8.
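
For a particular small field the cyclicity can simply be checked by
computer. The sketch below (function names are invented for the
illustration, and it proves nothing in general) lists the nth roots of
unity in Z/qZ and finds which of them generate the whole group.

```python
def roots_of_unity(n, q):
    """All x in Z/qZ (q prime) with x^n = 1."""
    return [x for x in range(1, q) if pow(x, n, q) == 1]

def order(x, q):
    """Multiplicative order of x modulo the prime q (x not divisible by q)."""
    k, y = 1, x % q
    while y != 1:
        y = (y * x) % q
        k += 1
    return k

def generators(n, q):
    """Roots of unity whose powers give every nth root of unity in Z/qZ."""
    roots = roots_of_unity(n, q)
    return [x for x in roots if order(x, q) == len(roots)]

# n = 4 is not prime; the 4th roots of unity in Z/13Z still form a cyclic group.
print(roots_of_unity(4, 13), generators(4, 13))
```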
----------------------------------------------------------------------
You ask about the statement on p.156, 6th line of proof of Lemma 15.7,
that if alpha_1\in K, so that L=K(alpha_2,...,alpha_n), the
result of the lemma holds by induction.
Notice that the second paragraph of the proof began "We prove the
result by induction on n"; i.e., on the number of elements that are
adjoined (each having a prime power in the field generated by those
that precede) to get L from K. So in this induction we assume the
result is true of all extensions L:K that can be gotten in this way
using < n such elements. Now if alpha_1 \in K, then L=K(alpha_2,
...,alpha_n) so L can be gotten from K using < n such elements,
hence by that inductive assumption, the conclusion is true.
----------------------------------------------------------------------
You ask about the statement on p.156, 6th from last line of text,
that if we set epsilon = alpha_1/beta, then epsilon^p = 1.
By assumption, alpha_1 has pth power in K, and beta is a zero
of the minimal polynomial of alpha_1. Now since t^p - (alpha_1)^p
is a polynomial with coefficients in K (by assumption on alpha_1)
which is satisfied by alpha_1, it is a multiple of the minimal
polynomial of alpha_1, which by assumption also has beta as a
zero. So beta^p - (alpha_1)^p = 0.
Hence epsilon^p = (alpha_1/beta)^p = (alpha_1)^p / beta^p = 1.
----------------------------------------------------------------------
You ask about the notation $M(\alpha_1)(\alpha_2, \ldots, \alpha_n)$
in the second display on p.157.
It means the same as $M(\alpha_1, \alpha_2, \ldots, \alpha_n)$. The
author is writing it as he does to emphasize that we may look at $L$
as gotten by taking the field $M(\alpha_1)$ and adjoining $\alpha_2,
\ldots, \alpha_n$ to it.
----------------------------------------------------------------------
You ask whether the proof of Theorem 15.3 (pp.155-157) carries
over in a straightforward way to fields not contained in C.
Yes. A difference, which has been mentioned in class several times,
is that working within the complexes, "normal closure" means the
least subfield of the complexes containing our given field and normal
over our base field, while for general fields, it is something that
one constructs abstractly by adjoining enough roots to an appropriate
polynomial so that it splits. The one other difference is where
Stewart refers to properties of t^p - 1 in Lemma 15.5, and the
proof of Lemma 15.7.  In these, t^p - 1 splits for a nontrivial reason,
which is still valid in abstract fields whenever p is not the
characteristic; while if p is the characteristic, then t^p - 1
splits for a trivial reason: because it equals (t-1)^p.
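
The identity t^p - 1 = (t-1)^p in characteristic p comes down to the
binomial coefficients C(p,k) with 0 < k < p being divisible by p. A
quick numerical check, written as a sketch with made-up function names:

```python
from math import comb

def expand_t_minus_1_power(p):
    """Coefficients of (t-1)^p reduced mod p, lowest degree first."""
    return [(comb(p, k) * (-1) ** (p - k)) % p for k in range(p + 1)]

def t_power_minus_1(p):
    """Coefficients of t^p - 1 reduced mod p, lowest degree first."""
    return [(-1) % p] + [0] * (p - 1) + [1]

for p in (2, 3, 5, 7, 11):
    assert expand_t_minus_1_power(p) == t_power_minus_1(p)
print(expand_t_minus_1_power(5))
```

Note that p = 2 is covered as well, since there -1 = 1.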
----------------------------------------------------------------------
You ask why, in the first line of the proof of Theorem 15.3 on p.157,
M:K_0 will be radical if M:K is.
Note that K_0 as defined here contains K and is contained in M.
Hence if M is generated over K by a "radical sequence" (last line
of p.153) then it will be generated over K_0 by the same sequence,
and from the definition of radical sequence, a sequence that is a
radical sequence over K will still be one over the possibly larger
field K_0.
----------------------------------------------------------------------
You ask in connection with Definition 15.8 (p.158) whether normality is
of interest outside the fact that it characterizes splitting fields; and
in particular, whether it is of interest for infinite extensions.
Yes; in fact I would say that the reason for the emphasis on splitting
fields is that they have the property of normality. If normality were
not so powerful, then the natural extension of a field K to associate
to an irreducible polynomial f would be K[t]/<f>, the field
gotten by adjoining a single root of f. Note that the Fundamental
Theorem of Galois Theory is true for finite normal separable extensions,
but not for other finite extensions.
Infinite normal extensions are indeed studied; however to state and
prove a version of the Fundamental Theorem of Galois Theory for these
requires topological as well as algebraic concepts, so it is not done
in courses like Math 114 or even 250A (though I develop it in Math 250B,
which gives the instructor a lot of leeway on what to include).
----------------------------------------------------------------------
You ask how Theorem 15.9 (p.158) is a "restatement" of Theorem 15.3.
This assertion of Stewart's is a bit sloppy. Theorem 15.9 is a
restatement of the case of Theorem 15.3 where L:K is normal, and
hence is a splitting field of some polynomial.
----------------------------------------------------------------------
You ask whether Stewart's assertion "(this also follows by
irreducibility)" near the bottom of p.159 is based on
separability of irreducible polynomials over C.
Right!
----------------------------------------------------------------------
You ask about Stewart's statement on p.163 that "... the machinery
needed to prove the existence of an algebraic closure is powerful
enough to make the concept of an algebraic closure irrelevant anyway..."
He isn't referring to what he does in this chapter, but in Chapter 17,
for which Chapter 16 is preparation. The gist of what he means is that
the process of constructing an extension of an arbitrary field K in
which all polynomials have roots (an algebraic closure) begins with the
construction of extensions in which we get roots of an arbitrary finite
set of polynomials; but this is all that Galois theory needs, so for
the purposes of Galois theory we may as well stop there. (What he
doesn't say is that to go from that construction to the construction of
an algebraic closure of K also requires considerable set-theoretic
machinery that we don't need for Galois theory; so it is really quite
fortunate that we can do without it.)
----------------------------------------------------------------------
You asked about integers modulo n when n is not positive, as
in the uncorrected version of line 4, p.165.
Whatever the value of n, it generates an ideal nZ, so one can
look at the quotient ring Z/nZ. In particular, when n is
negative, say -m where m is positive, this is what is usually
written Z/mZ or Z_m, while when n = 0 it is isomorphic to Z.
When m = 1, what we get fails to satisfy the condition "1 \neq 0" in
(M3) in Stewart's definition of a ring. Authors differ as to whether
to impose that condition. Leaving out that condition allows just one
additional structure as a ring: the trivial ring, i.e., the structure
with only a single element, which is both the 1 and the 0. What I
consider the best choice is to allow this as a ring, but to exclude
it from the definitions of field and integral domain by imposing the
condition 1\neq 0 there. This choice seems implicit in what Stewart
does, since on p.167 he constructs R/I for any ideal I, even
though I = R would make R/I the trivial ring, while on p.166
line 4 he says Z_1 is not a field. However, in order to minimize the
meddling I do with his definitions, I have not put such a change in
definitions into the errata.
----------------------------------------------------------------------
You ask how one proves part 4 of Example 16.4, p.165, i.e., that
in Q[t] not all elements have inverses.
If a polynomial p(t) has an inverse q(t), then p(t)q(t) = 1,
which has degree 0. Looking at the formula on p.20, first display,
for the degree of a product, we see that this is impossible if p
is taken to have degree > 0.
----------------------------------------------------------------------
You ask about Stewart's statement on p.166, item 10 near top, that
Z_1 is not a field because it doesn't satisfy the condition 1 not-= 0
of (M3).
Well, I think it was a mistake for him to include that condition
in his definition of a ring, because it makes the statement "R/I
is a ring for every ideal I" false for the ideal I=R; but I think
it should be kept in the definitions of integral domain and field,
because these are classes of rings with special important properties,
and some of those properties are lost if we allow 1=0. (Note that
if 1=0 in R, then multiplying by any x\in R, we get x = 0,
so a ring with 1=0 has just one element, the zero element.) So
we will continue to regard Z_1 as _not_ a field.
Fortunately, the details of which definition the condition 1 not-= 0
is put in -- the definition of ring or the definitions of integral
domain and field -- won't make for problems once we get into the meat
of the course, because we will be dealing mainly with fields and
polynomial rings over fields, so 1 not-= 0 will indeed hold.
----------------------------------------------------------------------
You ask whether in Case 1 on p.170, to prove that the ring of elements
n* is isomorphic to Z_p, one should show that n* = m* <=> [n] = [m]
in Z_p.
That is a key step. One also has to show that they add and multiply
like members of Z_p.
The quickest way to do all this together is to notice that m |-> m*
is a ring homomorphism, so its kernel is an ideal of Z, and recall
from Math 113 that every ideal I of Z is principal, and that if
(as in this case) it is not {0}, then it is generated by its smallest
positive element. In this case, that is p, so the kernel of
m |-> m* is pZ, so its image is isomorphic to Z/pZ.
----------------------------------------------------------------------
You ask why, in defining on p.172 the set S of ordered pairs (r,s)
from which he will construct the field of fractions of R, Stewart
requires s not-= 0.
The idea behind the construction is that (r,s) is the element
that will ultimately represent the fraction r/s, and we don't
expect that fraction to make sense if s = 0. As for what would
go wrong in the proof if we allowed s = 0, it is the verification
that ~ is an equivalence relation. For every pair (r,s) we see
from the equation defining the relation that (r,s) ~ (0,0), hence for
any two pairs (r,s) and (r',s') we have (r,s) ~ (0,0) ~ (r',s'),
but in general (r,s) ~ (r',s') will not be true.
You also ask about the need for R to be an integral domain in
verifying that the maps Stewart defines are operations (point 2 on
his "checklist").  That condition is needed because otherwise in a
product such as [r,s][t,u] = [rt,su], the term su could be zero,
which as just noted, we can't allow.
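
The failure of transitivity when s = 0 is allowed can be seen concretely
with R = Z. This is only a sketch; the function name "related" is made
up, and it assumes the relation is (r,s) ~ (r',s') <=> r s' = r' s.

```python
def related(pair1, pair2):
    """(r, s) ~ (r2, s2) iff r * s2 == r2 * s, as for fractions r/s."""
    (r, s), (r2, s2) = pair1, pair2
    return r * s2 == r2 * s

a, zero, b = (1, 2), (0, 0), (1, 3)

# Every pair is related to (0, 0), yet 1/2 and 1/3 are not related.
print(related(a, zero), related(zero, b), related(a, b))
```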
----------------------------------------------------------------------
You ask how the construction of the field of fractions of an
integral domain (Theorem 16.16, p.172) shows that every integral
domain is isomorphic to a subring of some field.
The definition of "field of fractions" says that it has a subring
R' isomorphic to R, which is the condition you ask about.
(In the proof of the theorem, this comes out as point 4 at the
bottom of p.172: the map described is a monomorphism, and any
monomorphism gives an isomorphism between its domain and its image.)
----------------------------------------------------------------------
You ask in connection with the construction of fields of fractions
(p.172) whether every field K can be expressed as the field of
fractions of a proper subring A. This is a subtle question. If K
is finite, or even an infinite union of finite subfields, then one
cannot. In other cases one can, but to prove this in general requires
tools beyond this course. For instance there is no obvious way of
finding a proper subring A whose field of fractions is the field of
real numbers. One can start by saying "We want A \intersect Q to be
Z", and try to construct such an A by finding a set of elements which
when adjoined to Q gives R, and then adjoining them to Z instead;
but we have to be very careful in how we choose these elements so that
we don't get all of R in the process. It requires using something
called the Axiom of Choice (discussed in Math 135) together with a lot
of careful algebra.
----------------------------------------------------------------------
You ask "Do we think of a set of rational expressions as making up
a field?"
Yes! Stewart is a little vague in Chapter 4 about what "rational
expressions" are, but in Chapter 16 he is quite precise. On p.175,
at the beginning of the long final paragraph, he explicitly says what
he will mean by the phrase.
----------------------------------------------------------------------
You ask about the identification of K with iota(K) in the
paragraph above Theorem 17.3 on p.179, saying "Is this just because
iota is an inclusion, ... ?"
It isn't literally an inclusion. To "identify K with iota(K)" means
to pretend it is an inclusion, so as to keep the picture easier to
understand; in essence, to use the same symbols for elements
of K and their images in K[t]/I and think of them as the same.
But for each c\in K, iota(c) is the equivalence class I + c.
----------------------------------------------------------------------
Concerning the definition of a splitting field, p.181, you write that
you haven't been able to think of an example of a splitting field of
a polynomial that is not in the complex numbers or Z_p.
What kind of an answer to give you depends on exactly what you want;
in particular, whether it is the base field that you don't want to
be a subfield of C or Z_p, or the extension field, and whether you
literally mean "in", or mean "related to".
Every field either has characteristic 0 or characteristic a prime,
and fields with characteristic 0 look similar to subfields of the
complex numbers, while fields of characteristic p all have Z_p as
prime field; so in that sense, you can't get away from those two cases.
However, an example of a characteristic 0 splitting field where the
fields themselves do not lie in the complex numbers is the extension
L:K studied in sections 8.7 and 8.8 (defined on p.95, last display),
where both fields contain the field of complex numbers, but don't lie
inside it. Note that L is generated over K by t_1,...,t_n, which
are the roots of the "general polynomial", so it is the splitting
field of that polynomial. One can do exactly the same construction
starting with any field, including Z_p, in place of C. An example
which is similar in that it uses rational function fields is the
"t^p - x^p" one that I gave in class, which is similar to Example 17.16
in tomorrow's reading. For an example where the base field is Z_p,
but the extension field is not of the form Z_p, one can take any
irreducible polynomial over Z_p (e.g., t^2 + t + 1 when p = 2, or
t^2 + 1 when p = 3), call it f, and construct its splitting field by the
"K[t]/<f>" method.  (Exercise 16.6, which I did not put in the homework,
is the first of these examples.)
Finally, if you're just concerned with the extension field not
literally being inside the complex numbers, even though it may
be isomorphic to a subfield of the complex numbers, the example
I gave in class of a square root of 3 in the 2x2 matrix ring
over Q, and the field that it generates, which is isomorphic to
Q(sqrt 3), would do.
----------------------------------------------------------------------
You ask about the fact that Stewart seems to define the concept of
"adjoining" elements to a field twice within a few pages: in the
third line of section 17.2 on p.178, and in Definition 17.7 on p.181.
The two situations are not the same. In the former, we are given
elements of an extension field L of K, and "adjoining" them gives
a field K(X) containing K and X and contained in L. In the
latter, we are not given a field containing K; rather, we build
one, either as the field of fractions K(t) of the polynomial ring
over K, or as a factor ring K[t]/<m> for an irreducible polynomial m.
But, the latter case "retrospectively" takes the form of the former,
in that the extension field K(t) or K[t]/<m> turns out to be
generated over K by the new element, t or <m> + t; so
it is reasonable to use the common term "adjoin" for both constructions.
----------------------------------------------------------------------
You ask about the statement in the bottom paragraph of p.181 that
splitting fields are unique up to isomorphism.
The justification is in the next sentence: Stewart observes that the
proof is the same as that of Theorem 9.6. So the reader is expected
to go back to that theorem and read through the proof again, making
sure that every step is valid for a general field, not just a subfield
of the complex numbers.
I am not enthusiastic about Stewart's leaving this verification to
the reader in an undergraduate text; but it is logically valid.
----------------------------------------------------------------------
You ask for examples of distinct splitting fields of the same
polynomial (p.181, bottom), and why splitting fields in C are unique.
Three examples of splitting fields over Q of t^2 - 2 are
(i) the subfield of C generated by sqrt 2, (ii) the field
Q[t]/<t^2 - 2>, and (iii) the ring of 2 x 2 matrices over Q
generated by the scalar matrices and the matrix with 2 in the upper
right corner, 1 in the lower left, and 0 in the other two corners.
As to why splitting fields are unique in C -- if K is a field,
f an irreducible polynomial over K, and L an extension of K
in which f splits, then f has a unique splitting field over K in
L, namely the subfield of L generated by all the zeroes of f in
L. So it's nothing special about C; it's just the fact that we are
restricting the zeroes to all lie in a specified field, rather than
allowing them to be taken in different fields.
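
That the matrix in example (iii) really is a square root of 2 is a
one-line computation; the following sketch (with a hypothetical helper
mat_mul) just carries it out.

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# 2 in the upper right corner, 1 in the lower left, 0 elsewhere.
M = [[0, 2],
     [1, 0]]

M_squared = mat_mul(M, M)
print(M_squared)  # twice the identity matrix, so M plays the role of sqrt 2
```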
----------------------------------------------------------------------
In your question about Lemma 17.4, p.183, you say that if the
characteristic of k is p > 0, then k is isomorphic to Z_p.
Not so! The elements of k obtained by adding 1 to itself
repeatedly form a subfield isomorphic to Z_p, called the prime
subfield of k (Def.16.8, p.169), but this is not in general all of k.
We have seen examples where it is not. The easiest to describe are the
fields of rational functions in one or more indeterminates over Z_p,
i.e., Z_p(t), Z_p(t_1, t_2), etc.  Others can be obtained by adjoining
to Z_p zeroes of irreducible polynomials, as in the present homework
assignment.
----------------------------------------------------------------------
Regarding the top line on p.185 you ask why one has unique
factorization.
By Theorem 3.16, p.38. As with so many other results, Stewart proves
this for polynomials over subfields of C, but the proof he gives
works over any field, so he uses it in this later section without that
restriction.
----------------------------------------------------------------------
You ask about the last two sentences of the proof of Lemma 17.21, p.186.
We have seen that alpha has minimal polynomial g(t^p), while
alpha^p is a zero of g(t). So if we let n be the degree of g(t),
the minimal polynomial of alpha over K has degree pn, while the
minimal polynomial of alpha^p over K has degree _< n, which is
smaller than pn.  So [K(alpha):K] > [K(alpha^p):K], so K(alpha)
and K(alpha^p) can't be the same field.
----------------------------------------------------------------------
You ask about the "it suffices" statement at the top of p.187.
If we prove the set of elements separable over K is closed under
the operations listed, then it will be a subfield of L. It clearly
contains K, and by assumption, it contains a set S of elements that
generate L over K, so it will contain the field K(S) = L, meaning
that L is separable over K, which is what we are trying to prove.
----------------------------------------------------------------------
You ask whether Proposition 17.18, p.185, can be proved algebraically,
without the use of formal derivatives.
Although the derivative of calculus is an analytic/topological notion,
the formal derivative of a polynomial over an arbitrary field is an
algebraic one -- it is defined and the properties we use are proved
without using limits, even though it is inspired by the analytic
concept. And it is a very powerful algebraic tool!
My understanding is that no way is known to get the corresponding
results without it. Incidentally, though the factorization properties
of integers and of polynomials over a field are very similar, the
formal derivative provides a way of telling whether a polynomial has
a multiple factor, but no analogous method is known that works for
integers. (So a polynomial of degree 100 can quickly be tested for
multiple factors, but a 100-digit integer cannot.)
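
To make the test concrete: over Q, a polynomial f has a multiple factor
exactly when gcd(f, Df) is nonconstant. Here is a minimal sketch of that
test, assuming polynomials are stored as coefficient lists with lowest
degree first; all the function names are invented for the example.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients (the zero polynomial becomes [])."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f

def derivative(f):
    """Formal derivative: the coefficient of t^k is multiplied by k."""
    return trim([k * c for k, c in enumerate(f)][1:])

def poly_mod(f, g):
    """Remainder of f on division by the nonzero polynomial g."""
    f = trim(list(f))
    while len(f) >= len(g):
        factor = f[-1] / g[-1]
        shift = len(f) - len(g)
        for i, c in enumerate(g):
            f[shift + i] -= factor * c
        f = trim(f)
    return f

def poly_gcd(f, g):
    """Monic gcd by the Euclidean algorithm (f must be nonzero)."""
    f, g = trim(list(f)), trim(list(g))
    while g:
        f, g = g, poly_mod(f, g)
    return [c / f[-1] for c in f]

def has_multiple_factor(coeffs):
    """True iff the polynomial over Q shares a nonconstant factor with Df."""
    f = trim([Fraction(c) for c in coeffs])
    return len(poly_gcd(f, derivative(f))) > 1

# t^3 - 3t + 2 = (t-1)^2 (t+2) has a repeated factor; t^2 - 2 does not.
print(has_multiple_factor([2, -3, 0, 1]), has_multiple_factor([-2, 0, 1]))
```

The Euclidean algorithm takes a number of steps bounded by the degree,
which is why the degree-100 case is quick.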
----------------------------------------------------------------------
You ask whether the polynomial t^6 - t^3 + 1 over Z_3 (p.190,
last line) is separable.
It factors as (t^2 - t + 1)^3 = (t+1)^6. By Stewart's
Definition 17.19 it is separable, since all the irreducible
factors are separable. By everyone else's definition it is
inseparable, because it has multiple roots. That's why I
left out that question.
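
The factorization can be confirmed by multiplying out modulo 3. This is
a sketch with hypothetical helper names; coefficient lists are written
lowest degree first.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists with coefficients reduced mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def poly_pow_mod(f, e, p):
    """f raised to the power e, coefficients reduced mod p."""
    result = [1]
    for _ in range(e):
        result = poly_mul_mod(result, f, p)
    return result

target = [1, 0, 0, 2, 0, 0, 1]                   # t^6 - t^3 + 1, with -1 = 2 in Z_3
cube_of_quadratic = poly_pow_mod([1, 2, 1], 3, 3)  # (t^2 - t + 1)^3, with -t = 2t
sixth_power = poly_pow_mod([1, 1], 6, 3)           # (t + 1)^6
print(cube_of_quadratic == target, sixth_power == target)
```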
----------------------------------------------------------------------
You ask why the second line of p.193, starting with "Then", follows
from the first.
I guess you mean the second sentence, since the second line contains
no assertions, just the definition M = K(t,u). The sentence asserts
that t and u are independent transcendentals over K and then
on the next few lines makes a finiteness assertion. Here I will
guess that it is the former that you are asking about.
This follows from part (ii) of the lemma I put on the board on Monday,
which I noted was used by Stewart without explicit statement, and both
parts of which are implicitly proved in my proof of the Steinitz
Exchange Lemma.  That statement is that if a family of elements are
independent transcendentals over K, and we bring in one more element
that is transcendental over the field generated by those elements,
then the resulting enlarged family is again a family of independent
transcendental elements over K. In this case, the "given family" has
just the one element t, and the additional element is u.
----------------------------------------------------------------------
> Bottom of 193 to top of 194
>
> I'm a bit confused as to the point of Stewart's comments "now
> interpreted as elements of K(\alpha_{1}, . . . \alpha_{n})" and the
> point that we are evaluating at t_{i} = \alpha_{i}. What's the point
> of these comments? ...
I'm not really sure. Probably the fact that earlier, the symmetric
polynomials we considered were elements of k[t_1,...,t_n] lying
within the field L of section 8.7; but here we are looking at the
result of substituting the alpha's into those polynomials, rather
than having the polynomials themselves as elements of our field.
(Though in the first half of section 18.3, he will again have the
polynomials as elements of his field!)
----------------------------------------------------------------------
You ask what is involved in generalizing Exercise 8.4 to an arbitrary
field, as called for in the proof of Theorem 18.8, p.194.
Nothing -- the same proof applies! Stewart says "generalized" simply
because the exercise was originally stated in a chapter where all
fields were assumed to be subfields of the complex numbers, and he
now wants to use it without that assumption.
----------------------------------------------------------------------
You ask why, on p.195, paragraph after 1st display, the fixed field of
S_n contains the symmetric polynomials in the t_i.
That is the definition of "symmetric polynomial" -- a polynomial that
is unchanged by all permutations of the indeterminates, i.e., by
the action of S_n.
----------------------------------------------------------------------
You ask why $f(t_n)=0$ holds in the fifth line of the proof of
Lemma 18.9, p.195.
I hope that my version of that proof in class made this clear:
f(t) is what one gets when one expands the product (t-t_1)...(t-t_n),
hence it has all of t_1,...,t_n as zeroes.
(Stewart goes through this in section 18.2.)
----------------------------------------------------------------------
You ask how Stewart gets the inequality
[K(s_1,...,s_n,t_n) : K(s_1,...,s_n)] _< n
on p.195.
If we call K(s_1,...,s_n) "M", this says [M(t_n):M] _< n, which
means that t_n satisfies a polynomial of degree _< n over M.
And that is what Stewart has just proved, using the polynomial f.
----------------------------------------------------------------------
You ask how one would come up with the relation s_j =
t_n s'_{j-1} + s'_j given on page 195, 3rd to last display.
Well, s_j is a symmetric polynomial in t_1,...,t_n, but one
wants to see what form it takes when we separate t_n from the
other t's. s_j will still be symmetric in those other t's, and
its degree in t_n is 1, so it can be written as [?]t_n + [?], where
the two "[?]"s must be symmetric polynomials in t_1,...,t_{n-1},
since s_j is symmetric in those elements.  So we look at what form
the coefficient of t_n in s_j has, and what form the set of
monomials not involving t_n has; we find that the former equals
s'_{j-1}, and the latter, s'_j.
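
One way to convince oneself of the relation is to substitute numbers for
the t's. The sketch below does that for one arbitrary choice of values;
the function name is made up, and a numerical check is of course not a
proof.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, j):
    """Sum of all products of j distinct entries of values (s_0 = 1)."""
    return sum(prod(c) for c in combinations(values, j))

t = (2, 3, 5, 7)          # sample values for t_1, ..., t_n with n = 4
t_n, rest = t[-1], t[:-1]

# s_j = t_n s'_{j-1} + s'_j, where s' are the symmetric polynomials in rest.
relation_holds = all(
    elementary_symmetric(t, j)
    == t_n * elementary_symmetric(rest, j - 1) + elementary_symmetric(rest, j)
    for j in range(1, len(t) + 1)
)
print(relation_holds)
```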
----------------------------------------------------------------------
You ask what, in the proof of Lemma 18.11, p.196, ensures that
$(t_1,\ldots,t_n)$ are independent transcendental elements.
The words "With the above notation" in the statement of the Lemma,
where "the above notation" (same phrase used in Lemma 18.9) means
the assumptions of the first paragraph of the section.
----------------------------------------------------------------------
You ask about Stewart's putting ``over'' in quotation marks in the
statement of Definition 18.12, p.196.
Usually, a polynomial over a field means a polynomial with coefficients
in that field. But that is a case of the more general use of "over"
to mean "constructed by building something on top of". In this case,
the usage fits the wider sense but not the narrow sense; so Stewart
uses quotation marks to signal that there is a deviation from what
one would expect the term to mean based on the usual usage for
polynomials.
----------------------------------------------------------------------
You ask why the extension \Sigma : K(s_1,...,s_n) on the 4th line of
p.197 is separable.
This is easy to see using the development I gave in class: by the
uniqueness of splitting fields, it is isomorphic to the extension
discussed from the beginning of the section to the middle of p.196,
which is separable because the zeroes of the defining polynomial
are the independent (hence, distinct) indeterminates t_1,...,t_n.
Using Stewart's development, I guess one has to argue the point
separately: If it were not separable, then t_1,...,t_n would not
be distinct, hence the transcendence degree of K(t_1,...,t_n) would
be < n, from which one could get a contradiction using Theorem 18.13.
----------------------------------------------------------------------
You ask in connection with the last sentence of p.197 why Hilbert's
Theorem 90 wasn't called Hilbert's Theorem 93.
It wasn't named after the year! There were a large number of numbered
results in his report, and this was the one he numbered "Theorem 90".
Evidently it stood out as important, but there wasn't any obvious
descriptive name to apply to it, so people started referring to it
by the way it was numbered in his report.  (Where Stewart says "its
appearance" he should have said something like "its designation".)
Whether it was the 90th theorem, or whether he used a common numbering
system for Definitions, Theorems, Lemmas, Remarks etc., so that it was
merely 90th among this larger collection, I don't know.
----------------------------------------------------------------------
You ask why, as stated on the line after the last display on p.199,
K(alpha) is the splitting field of t^p - a.
Each of the elements tau^j(alpha) = epsilon^-j alpha will be a zero
of that polynomial because they are images of one such zero under
members of the Galois group. Since there are p distinct such
elements, they form all the zeros of that polynomial. They all lie in
K(alpha) because of the assumption epsilon\in K, and they generate
K(alpha) over K since alpha is one of them; so K(alpha) is the
field generated by all the zeros of t^p - a, i.e., the splitting
field of that polynomial.
(When you saw this statement in Stewart, you should have asked yourself
"What has to be true for the assertion to hold?", and then you would
hopefully have seen that two things were needed: That p distinct zeros
of the polynomial be contained in K(alpha), and that they generate it.
You could then have asked yourself whether you could verify either
property; and if you got stuck, you could hopefully have at least
made your question narrower, e.g., "How do we know that all the zeros
of t^p - a lie in K(alpha)?")
----------------------------------------------------------------------
> Page 200, Paragraph above the end of proof box.
>
> How do we know that \phi^{-1}(H) is going to be normal in \Gamma(N:M)?
> I suppose this boils down to the question, why does [P:M] = p imply
> that P is normal, using Stewart's notation.
[P:M] = p doesn't imply that P is normal; look again at Stewart's proof
and you'll see that he gets those two facts by applying two different
parts of Theorem 17.23 to the two results of the preceding sentence.
It is those results that you need to see the justification of. I hope
what I said in class provided it, namely that in the situation in
question, phi is an isomorphism, so under phi, Gamma(N:M) "looks
just like" G; in particular, starting with the normal subgroup H
of index p in G, we get a normal subgroup of index p in
Gamma(N:M) by applying phi^-1.
----------------------------------------------------------------------
You ask why, as stated in the 2nd sentence of the 2nd paragraph of
the proof of Lemma 19.2, p.213, "to prove the lemma it is sufficient
to show that given (0,x) and (0,y) we can construct (0,x+y), (0,x-y),
(0,xy) and (0,x/y) ...".
Well, if we can do that, then the set of values of x such that we
can construct (0,x) by the operations of transferring the coordinates
of points of P to the y-axis, and then applying the operations listed
above, will be a field, namely the field generated by the coordinates
of all points of P. As Stewart notes in the preceding sentence, given
two points (0,x) and (0,y) on the y-axis, we can construct (x,y). So
for all x and y in the field generated by the coordinates of points
of P we can construct (x,y), as claimed in the lemma.
----------------------------------------------------------------------
You ask about the geometric constructions for multiplication and
division (p.213).
Since time is short, and no one else asked about these, I probably
won't get to them in class, so you should try to work through the
constructions in the book, and e-mail me or come to office hours if
you have questions.
What you write suggests that you are thinking of (x,0) |-> (1/x, 0) and
(x,0), (y,0) |-> (xy, 0) as the basic constructions. In the abstract,
these are the natural ones to start with, but the method that Stewart
shows instead begins with the operation (x,0), (y,0) |-> (x/y, 0),
because it happens to be very geometrically elegant. Once one has this
operation, one can apply it with x = 1 to get (y,0) |-> (1/y,0), and
then get multiplication via division: (x,0), (1/y,0) |-> (x/(1/y), 0)
= (xy,0).
So everything comes down to how to get (x,0), (y,0) |-> (x/y, 0).
This is illustrated in Fig.19.7, bottom of p.213. Can you follow it?
----------------------------------------------------------------------
You ask about the phrase "some suitable set of points" in the statement
of Lemma 19.3, p.214.
From Lemma 19.2 we know that as long as a set P of points contains
(0,0) and (1,0), we can get from P any other point whose coordinates
lie in the field generated by the coordinates of the members of P.
This means that there is a lot of freedom in what point-set we start
with. If alpha is a zero of t^2 + pt + q as in the proof, then
any set P which contains (0,0) and (1,0), and such that the
field generated by the coordinates includes p and q, will do.
----------------------------------------------------------------------
Regarding the proof of Lemma 19.3 on p.214 you ask
> ... if we can construct (0,sqrt(k)) how can we construct
> something in K(alpha)?
What Stewart says is, "... if we can construct (0,sqrt(k)) for any
positive k\in K ...". Assuming alpha has the form shown in the
display in that proof, the k that one needs to take the square
root of is p^2 - 4q. Using the square root of this element, and
the element p of K, and the operations of Lemma 19.2, one can
then get alpha.
----------------------------------------------------------------------
You ask about the Intersecting Chords Theorem, mentioned on p.214,
middle of page.
I mentioned it in my Friday preview of this reading: It says that if
you draw two chords in a circle, AB and CD, which intersect in a point
X inside the circle, then AX . XB = CX . XD (products of lengths).
To see its use in Figure 19.8, let AB be the line from (-1,0) to (k,0),
and CD be the vertical line through the origin, with C and D
the points above and below the axis where this line crosses the circle.
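Here is a quick numerical check of that use of the theorem. It is a
sketch under the assumption that the circle in the figure is the one
having AB as a diameter; the value k = 5 is just a sample.

```python
from math import sqrt, isclose

k = 5.0                      # sample positive value (assumption for illustration)
center_x = (k - 1) / 2       # circle with diameter from (-1,0) to (k,0)
radius = (k + 1) / 2

# Height h of the points C, D where the vertical line x = 0 meets the circle.
h = sqrt(radius**2 - center_x**2)

AX, XB = 1.0, k              # X is the origin
CX, XD = h, h
print(AX * XB, CX * XD, h, sqrt(k))
```

Since AX . XB = k and CX = XD, the common length CX must be sqrt(k),
which is the point of the construction.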
----------------------------------------------------------------------
You ask whether the converse to Theorem 19.4 (p.214) is true.
Right! This was shown in the proof of Theorem 7.4, though it
wasn't put into the statement of that Theorem.
----------------------------------------------------------------------
You ask about the significance of the table of powers of 3 mod 17
on p.220.
I previewed this development last time precisely because I felt
it would be incomprehensible without motivation.
What I showed was that if we write epsilon for a primitive 17th
root of 1, each automorphism of Q(epsilon):Q took epsilon to
epsilon^m for some m unique modulo 17, and that for every nonzero
m\in Z_17, there was such an automorphism. I showed that composition
of these automorphisms corresponds to multiplication of nonzero
elements of the ring Z_17, and indicated that experimentation showed
that the automorphism tau taking epsilon to epsilon^3 generated
the whole group. This experimentation consists of computing the
powers of that automorphism, which in view of the composition law
described above, corresponds to computing the powers of 3 mod 17, which
Stewart does. He then puts together elements x_1, x_2 which will each
be invariant under tau^2, and elements y_1,...,y_4 which will each
be invariant under tau^4, using that table.
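The experimentation is easy to reproduce. In the sketch below, the
function name "orbits" is my own; it groups the exponents m of
epsilon^m according to the orbits of tau^2 and tau^4, which are the
sets of exponents one sums over to get elements invariant under those
automorphisms. (Which orbit Stewart labels x_1 and which x_2, etc., is
his choice; the code does not determine that.)

```python
p, g = 17, 3

# Powers 3^0, 3^1, ..., 3^15 modulo 17.
powers = [pow(g, k, p) for k in range(p - 1)]
print(powers)

# 3 generates the multiplicative group iff the 16 powers are all distinct.
is_generator = len(set(powers)) == p - 1
print(is_generator)

def orbits(step):
    """Split the exponents 3^s mod 17 into orbits under tau^step."""
    return [sorted(powers[start::step]) for start in range(step)]

print(orbits(2))   # two orbits of tau^2, eight exponents each
print(orbits(4))   # four orbits of tau^4, four exponents each
```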
----------------------------------------------------------------------
You ask how a geometric construction follows from the expression for
cos theta (i.e., cos 2 pi/17) in terms of square roots on p.222.
Lemmas 19.2 and 19.3 show how a line segment of length indicated by
that expression can be constructed. As I pointed out in class, from
the cosine of an angle one can construct the angle, and from an angle
of 2 pi / n one can construct a regular n-gon.
----------------------------------------------------------------------
You ask how the notation e(G) (p.229) would be used in the case
of an infinite group G.
An infinite group can have finite or infinite exponent. For instance,
if F is a field of characteristic p, then the additive group of
the polynomial ring F[t] is infinite, but all elements of that
group have finite order, and the l.c.m. of those orders is p; so
e(F[t]) = p. On the other hand, if a group either has elements of
infinite order, or elements of infinitely many different finite orders,
then one has e(G) = infinity.
----------------------------------------------------------------------
You ask about Stewart's calculation Df = -1 in the proof of
Theorem 20.2, p.228.
It is correct. As you say, one has Df = q t^{q-1} - 1; but
the characteristic is p and q is a power of p, hence a
multiple of p, so as a coefficient, it equals 0.
----------------------------------------------------------------------
You ask whether Stewart is implicitly using Theorem 14.15 in the
proof of Lemma 20.6, p.229.
No. Theorem 14.15 only says that if a prime divides the order
of a group there will be an element of that prime order; it
doesn't give us any information about powers of primes. (For
instance, S_4 has order divisible by 2^3, but it has no element
of that order.) What Stewart is calling on here is the _definition_
of e(G), as an l.c.m., and the properties of l.c.m.s: if the l.c.m.
of a set of integers is divisible by p^n, then one of those integers
must be divisible by p^n.
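The S_4 example can be verified by brute force. This is a sketch; the
function name "order" and the representation of permutations as tuples
are my own conventions.

```python
from functools import reduce
from itertools import permutations
from math import gcd

def order(perm):
    """Order of a permutation given as a tuple mapping i -> perm[i]."""
    identity = tuple(range(len(perm)))
    current, k = perm, 1
    while current != identity:
        current = tuple(perm[i] for i in current)   # next power of perm
        k += 1
    return k

orders = [order(s) for s in permutations(range(4))]
exponent = reduce(lambda a, b: a * b // gcd(a, b), orders)   # l.c.m. of all orders

print(len(orders))            # |S_4|
print(sorted(set(orders)))    # element orders that occur
print(exponent)               # e(S_4)
```

The group order 24 is divisible by 2^3, but the exponent 12 is only
divisible by 2^2, and correspondingly there is an element of order 4
but none of order 8.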
----------------------------------------------------------------------
Regarding Example 20.10.2 on p.230 you ask
> How do we know GF(25) can be constructed as a splitting field for
> t^2-2 over Z_5 beforehand?
We know that any extension of Z_5 having degree 2 will be a field
of 25 elements, hence by the uniqueness Stewart has proved, will be
isomorphic to GF(25). It takes a few seconds of calculation to see
that 2 is not a square in Z_5; from that we can see that the
splitting field of t^2-2 will be an extension of degree 2.
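If you prefer to let a machine do those few seconds of calculation,
here is a minimal sketch:

```python
p = 5
squares = sorted({(x * x) % p for x in range(p)})
print(squares)          # the squares in Z_5
print(2 in squares)     # whether t^2 - 2 has a zero in Z_5
```

The squares are 0, 1 and 4, so t^2-2 has no zero in Z_5 and, being of
degree 2, is irreducible there.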
> Do we generally construct GF(p^n) as a splitting field for t^n-x
> over Z_p, for any x\in Z_p?
We can obtain it by whatever method we find convenient, given our
knowledge of field theory.
> Then what about GF(p^p)?
Obviously, in this case we want a splitting field of an irreducible
polynomial over Z_p of degree p. You saw in homework one very
convenient sort of polynomial of that form.
----------------------------------------------------------------------
You ask about Stewart's reference to the 11th root of 1 as a primitive
11th root of unity on p.233.
What Stewart is doing on this page is criticizing the approach he has
taken so far, which has accepted roots of unity as "radicals" because
they satisfy equations t^n - a, namely with a = 1. Ordinarily, one
can think of a solution to t^n - a as "the nth root of a", but Stewart
points out that if one denoted zeta_11 by the symbol "11th root of 1",
the obvious interpretation of that symbol would be "1", which is not a
primitive 11th root of unity. The reason for the difference is that
t^11 - 1 is not irreducible, so that when we adjoin a zero of that
equation to a field, we have to distinguish which irreducible factor
of t^11 - 1 it is a zero of, t-1 or t^10+...+t+1. Only the latter
gives a primitive 11th root of unity.
In this informal motivational discussion Stewart is playing on the
ambiguity of the symbol "nth root of a".  When a is a positive real
number, that symbol has the precise meaning of the positive real
solution of t^n - a = 0, but there is also the loose sense of "any
solution of t^n - a = 0". He is implicitly saying that the loose sense
of the symbol might justify our thinking of zeta_11 as represented by
the symbol "11th root of 1", but that the more precise sense does not.
----------------------------------------------------------------------
I hope I showed where the "rabbit out of the hat" on p.236 came from.
As for why Stewart "pulls these things out of nowhere and then gives
no explanation", my guess is that he feels that this is what math
texts do in general, and that he is just being more honest and pointing
it out when he does it. Unlike him, I think that many of the techniques
we use can be successfully motivated. Perhaps he feels that students
can gain most by analyzing the techniques used after seeing them, rather
than having the explanations handed to them on a platter. And perhaps
there are some differences between the English and American educational
systems that make this more valid there than here. I don't know.
----------------------------------------------------------------------
You ask about the "Z_4 quotient group" referred to on p.239, line 4.
That should be "subgroup", not "quotient group" -- thanks for pointing
it out! As I discussed in class, the Chinese Remainder Theorem shows
that Z_20 is isomorphic to the direct product of Z_5 and Z_4 as
rings, hence the group of its invertible elements is isomorphic to the
direct product of the groups of invertible elements of those two rings,
which have the forms Z_4 and Z_2. Since the subgroup Z_4 comes
from the ring Z_5, and consists of those elements of the direct
product with identity-element in their Z_2 component, it corresponds
to the automorphisms that permute the primitive 5th roots of unity
while fixing the primitive 4th roots of unity, +-i. So it corresponds
to the Galois group of Q(i,zeta):Q(i), equivalently, Q(xi):Q(i).
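The decomposition can be made concrete as follows. This is a sketch in
which an invertible element u of Z_20 stands for the automorphism
taking a primitive 20th root of unity to its u-th power; the variable
names are mine.

```python
from math import gcd

units = [u for u in range(20) if gcd(u, 20) == 1]

# Chinese Remainder Theorem: u in Z_20 corresponds to (u mod 5, u mod 4).
crt = {u: (u % 5, u % 4) for u in units}
print(crt)

# The Z_4 subgroup: units whose component in Z_4 is the identity 1,
# i.e., those that fix the 4th roots of unity +-i.
subgroup = [u for u in units if u % 4 == 1]
print(subgroup)

# It is cyclic: the powers of 13 run through all of it.
print(sorted(pow(13, k, 20) for k in range(4)))
```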
----------------------------------------------------------------------
You ask whether the "k" in Definition 21.1, p.241, is a single value,
or varies from one step to another.
Good point. As you guessed, he means it varies from one step to
another, though his wording doesn't make that clear.
----------------------------------------------------------------------
The three of you asked why, on p.242, first paragraph, we have
Q(theta, zeta) = Q(theta zeta).
The multiplicative orders of theta and zeta, namely p-1 and p,
are, as Stewart says on the first line, coprime. It follows that
the order of their product is the product of their orders. (For
elements of coprime orders in an arbitrary finite abelian group, I
proved this in my lecture of April 22, in clarifying the proof of
Lemma 20.6.) Hence the subgroup generated by both of them will be
cyclic with that product as a generator. Hence Q(theta zeta)
contains both of them, hence contains Q(theta, zeta). The reverse
inclusion is immediate.
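For a concrete illustration, one can represent the root of unity
exp(2 pi i r) by the rational number r, so that multiplying roots
corresponds to adding these numbers mod 1, and the order of the root
is the denominator of r in lowest terms. The sketch below does this
for the sample prime p = 7; the representation and the names are my
own, and taking theta = exp(2 pi i/(p-1)), zeta = exp(2 pi i/p) is an
assumption made just to have specific elements of the stated orders.

```python
from fractions import Fraction

def order_of_root(r):
    """Multiplicative order of exp(2*pi*i*r) for a rational number r."""
    return Fraction(r).denominator   # a Fraction is kept in lowest terms

p = 7                         # sample odd prime (assumption for illustration)
theta = Fraction(1, p - 1)    # a root of unity of order p-1
zeta = Fraction(1, p)         # a root of unity of order p
product = theta + zeta        # multiplying roots adds the exponents

n = order_of_root(product)
print(order_of_root(theta), order_of_root(zeta), n)

# Exponents k with (theta zeta)^k = theta, respectively = zeta.
k_theta = next(k for k in range(n) if (k * product - theta).denominator == 1)
k_zeta = next(k for k in range(n) if (k * product - zeta).denominator == 1)
print(k_theta, k_zeta)
```

So the product has order 6 . 7 = 42, and both theta and zeta are
powers of it.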
----------------------------------------------------------------------
You ask about the comment on p.243, 3rd and 4th lines below
display (21.9), that the maximum radical degree is max(2,(p-1)/2).
Well, in that remark p is assumed an odd prime, so p-1 is even,
so (p-1)st roots can be expressed in terms of square roots and
(p-1)/2-th roots.
----------------------------------------------------------------------
You ask where in the argument on p.246 Stewart uses the assumption
m_{[\epsilon^{p}]}(t) \neq m_{[\epsilon]}(t) made on line 7 of
that page.
In the middle of the page, where he says "Therefore,
m_{[\epsilon^{p}]}(t) and m_{[\epsilon]}(t) have a common zero
... so that \bar{t^n - 1} has a repeated zero ...". If they were
the same factor of \bar{t^n - 1}, having a zero in common wouldn't
make that a repeated zero.
The same argument could be done without contradiction. Without
assumption as to whether these two divisors of \bar{t^n - 1} are
distinct, one could show as he does that they have a common zero, hence
since \bar{t^n - 1} has no multiple zeros, each zero is the zero of
one factor only, hence these factors with a common zero must be the
same.
----------------------------------------------------------------------
The question you handed in is from section 22.2, which we're not
covering!
However, here is the answer. You wanted to know how Stewart comes
up with the elements phi and psi used on p.253.
To build up the splitting field of f step by step, we want to
construct first the fixed field of the normal subgroup A_3, and then
from it the whole field. The fixed field of A_3 will consist of
elements that are invariant under cyclic permutations of alpha_1,
alpha_2, alpha_3, but not necessarily under other permutations. The
easiest expressions to examine are polynomials in the alpha's. Note
that any polynomial in a set of elements that is invariant under a
group G of permutations will be a linear combination of "orbits" of
monomials under G. Experiment, and you will find that any monomial of
degree 1 or 2 will have the property that its orbit under A_3 is also
invariant under S_3, so it will give an element of the base field
rather than generating the extension we want. When we reach degree 3,
we find that some of the monomials still have that property, but the
monomial alpha_1^2 alpha_2 is sufficiently "asymmetric" so that
taking its orbit under cyclic permutations doesn't completely
symmetrize it. So we use its orbit, the sum of which Stewart calls
phi. To simplify the argument that follows, he also throws in the sum
psi of the orbit of alpha_1 alpha_2^2, the image of phi under any
element of S_3 that is not in A_3.
----------------------------------------------------------------------
You ask whether for each n there is a formula for the discriminant
of a polynomial f of degree n in terms of its coefficients, like the
one Stewart gives for the cubic on p.256.
Yes, because the discriminant is a symmetric polynomial in the
zeroes of f (we saw that it is unchanged under permutation of the
zeroes), and every symmetric polynomial is expressible in terms of the
elementary symmetric polynomials (Exercise 8.4), which are given
by the coefficients of f.
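As a small sanity check in the cubic case, here is a sketch comparing
the discriminant computed from the zeros with the value computed from
the coefficients. I use the standard formula -4p^3 - 27q^2 for a cubic
of the form t^3 + pt + q; that normalization is my assumption, and the
display in Stewart may be written differently. The sample zeros
1, 2, -3 are chosen to sum to 0 so that there is no t^2 term.

```python
from itertools import combinations
from math import prod

roots = [1, 2, -3]        # sample zeros summing to 0 (assumption)

# Coefficients of (t-1)(t-2)(t+3) = t^3 + p t + q, via the elementary
# symmetric polynomials in the zeros.
p = sum(a * b for a, b in combinations(roots, 2))
q = -prod(roots)

from_roots = prod((a - b) ** 2 for a, b in combinations(roots, 2))
from_coefficients = -4 * p**3 - 27 * q**2
print(p, q, from_roots, from_coefficients)
```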
----------------------------------------------------------------------
You ask what Stewart means, on the line after the 3rd display on p.257,
by "expand in powers of t".
He means multiply out the big product (n! factors) and write the result
as a polynomial in t; i.e., collect those terms that don't involve
t and make their sum the constant term of the polynomial, collect
those terms with a single factor t and make their sum the linear
term of the polynomial, and so on, all the way to the t^{n!} term.
----------------------------------------------------------------------
> I had trouble with the proof of Thm 22.8 on page 258, especially the
> third sentence below the first display in the proof. I didn't see why
> Q_1 is one of the factors, ...
Well, if you'd looked closely at the reason Stewart gives, you would
have seen that he says "because y - beta divides H", and you would
have asked "What is y?", and have been the first to discover another
typo. That should be "t - beta".
So: t - beta is one of the factors of H over Sigma by definition
of H. Now the irreducible factors Q_1, ..., Q_k of Q, when looked
at over Sigma, are gotten by collecting together different factors
t - sigma_x(beta) in the definition of Q, as discussed in the
next-to-last paragraph of p.258. By assumption, Q_1 is the one
and only one of the Q_j's that has t - beta itself as a factor. So,
since we have observed that H also has t - beta as a factor, Q_1
must be one of the Q_j's that H is the product of.
----------------------------------------------------------------------
You ask about the generalization of Gauss's Lemma that I remark (in
the corrections) that Stewart implicitly calls on on p.258.
That generalization is gotten by replacing the base ring "Z" in our
statement of Gauss's lemma by any unique factorization domain, i.e.,
ring in which every element has a factorization into irreducible
elements that is unique up to multiplication by invertible elements.
The proof is virtually the same. Then one proves (with the help of
that lemma, and the fact that polynomials over a field form a unique
factorization domain) that for any unique factorization domain A, the
polynomial ring A[t] is also a unique factorization domain. Then
one verifies by induction on n that polynomials in n indeterminates
over a field form a unique factorization domain, and hence that Gauss's
lemma is applicable to them. The arguments are all elementary; this is
regularly covered in Math 250A, and sometimes, I think, in Math 113 as
well. Even if it isn't taught, it's likely to be in a 113 textbook.
----------------------------------------------------------------------
You ask about the feasibility of the factorizations that Stewart's
algorithm (pp.257-258) calls for.
I've never studied these computational questions in practical terms.
But cf. Exercise "3.19" on the second homework assignment sheet.
You also ask for an example of its application more complicated than
the quadratic he gives at the end.
I shudder at the thought!
----------------------------------------------------------------------
You ask whether there is an "easiest way" to compute bold-G (p.258)
for polynomials of relatively high degree.
The first problem is how to factor Q into irreducible Q_1 ... Q_k.
Exercise "3.19" on the second homework assignment sheet shows that for
base field the rational numbers, there is an algorithm for factoring
polynomials in one variable. I would imagine that similar methods could
be developed for polynomials in several variables; but what their level
of computational difficulty would be I don't know.  Once one has Q_1,
it should be relatively easy to see which elements of S_n it is
invariant under, thus giving bold-G -- unless, of course, n is
really large, say around 20, so that Q_1 has billions of terms,
filling up a giant database, in which case even that problem could
require great ingenuity for a programmer. Although in that case, it's
likely that one would get stuck before that stage in trying to factor
Q, or even write it down.
----------------------------------------------------------------------
You ask how renumbering the roots changes Gamma(f) to a conjugate
subgroup, as stated on p.251.
I talked about this in class on Monday in connection with that
day's reading. To be precise, it is the subgroup of S_n corresponding
to Gamma(f) that is changed to a conjugate. I gave the example of
the polynomial t^4 - 2. If we let alpha_1 = 2^{1/4},
alpha_2 = i 2^{1/4}, alpha_3 = -2^{1/4}, alpha_4 = -i 2^{1/4}, then
the permutation (1234) belongs to the Galois group (it is the
rotation sigma of Fig.13.1, p.183). But if we number them so that
alpha_1 = 2^{1/4}, alpha_2 = -2^{1/4}, alpha_3 = i 2^{1/4},
alpha_4 = -i 2^{1/4}, then that automorphism is represented by
(1324), while (1234) ceases to represent an automorphism, since
an automorphism sending 2^{1/4} to -2^{1/4} must clearly send
-2^{1/4} back to 2^{1/4}.
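The effect of the renumbering can be traced mechanically. In this
sketch (the encoding is my own), a zero i^e 2^{1/4} is labelled by
its exponent e, the automorphism sigma multiplies each zero by i, and
a numbering is a dictionary from indices 1,...,4 to exponents.

```python
def sigma(e):
    """sigma multiplies a zero i^e * 2^(1/4) by i."""
    return (e + 1) % 4

def as_cycle(numbering):
    """Cycle of sigma through index 1, for a numbering: index -> exponent e."""
    index_of = {e: idx for idx, e in numbering.items()}
    cycle, idx = [], 1
    while idx not in cycle:
        cycle.append(idx)
        idx = index_of[sigma(numbering[idx])]
    return tuple(cycle)

first = {1: 0, 2: 1, 3: 2, 4: 3}    # 2^(1/4), i 2^(1/4), -2^(1/4), -i 2^(1/4)
second = {1: 0, 2: 2, 3: 1, 4: 3}   # 2^(1/4), -2^(1/4), i 2^(1/4), -i 2^(1/4)

print(as_cycle(first))
print(as_cycle(second))
```

The same automorphism comes out as (1234) under the first numbering
and as (1324) under the second.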
----------------------------------------------------------------------
You ask why, in the proof of Theorem 22.7, p.256, we have to
require char K not-= 2.
The author shows this explicitly in the proof, when he says "since
char K not-= 2 we have delta not-= -delta."
----------------------------------------------------------------------
You're right that in the proof of Theorem 22.7, p.256, the question of
separability needs to be considered in getting statements 1 and 3 of
that theorem.
However, this is taken care of if we start with the (trivial) proof of
statement 2. That doesn't use Galois theory, and the result shows that
if f is inseparable, then Delta(f) = 0. Hence in proving statement
1 only the separable case needs to be looked at, and in statement 3 the
inseparable case is excluded by the assumption "Delta(f) not-= 0" that
I added in the errata.
----------------------------------------------------------------------
You ask what Stewart means by "conjugate subgroups" and "conjugacy
classes of subgroups" on pp.251-252.
For any g\in G, "conjugation by g" means the map h |-> g^{-1} h g
(or, depending on the author, g h g^{-1}. Since conjugation by g
under one definition corresponds to conjugation by g^{-1} under the
other, the two definitions lead to the same set of "conjugation maps",
also called "inner automorphisms of G"). Subgroups H_1 and H_2
are then called conjugate if there is some g\in G such that
H_2 = g^{-1} H_1 g. Being conjugate is an equivalence relation on
subgroups, and the equivalence classes are called conjugacy classes
of subgroups. (Being conjugate is likewise an equivalence relation
on elements, and the equivalence classes are called conjugacy classes
of elements.)
----------------------------------------------------------------------
You ask what is "a good way to get the Galois group of an irreducible
polynomial".
Certainly not the method of section 22.4, as Stewart indicates in
the very first paragraph of that section, p.256.
So far as this course is concerned: Study the algebraic properties
of the zeros, and use whatever particular facts about these you can
come up with.  (In connection with Exercise "21.15(b)", the hint
suggests the particular facts to use.)
----------------------------------------------------------------------
You ask, in connection with the last paragraph of section 22.1, p.252,
what the 5 conjugacy classes of subgroups of S_4 and of S_5 are.
For S_4 you found them all: S_4, A_4, D_8, V, and the cyclic
subgroup <(1234)>. (Three of these are normal; for the other
two we have to say "the conjugacy class of ---".)
For S_5, note that if G is such a subgroup, then for any
x\in {1,2,3,4,5}, the orbit-size formula |G x| = [G:G_x] shows that
|G| must be divisible by 5, hence by Cauchy's Theorem, G must
contain an element of order 5, which must be a 5-cycle. By a
conjugation, we can assume without loss of generality that this is
(12345). So we only need to consider subgroups containing that
element. See how far you can go from there in finding different
subgroups. If you fall short of 5, I can show you how to finish.
Incidentally, the argument used to show that any conjugacy class can
be assumed to contain (12345) generalizes to any prime n, and is
probably the reason for the fact mentioned by Stewart, that the number
of conjugacy classes tends to be relatively small when n is prime.
----------------------------------------------------------------------
You ask why at the end of the proof of Theorem 22.7, p.256, we can say
that G^dagger = K.
By the Fundamental Theorem of Galois Theory!
It's true that to apply this, we have to verify that f is separable.
But that follows from statement 2, and the assumption I added to
statement 3 in one of my errata, that Delta(f) is nonzero.
----------------------------------------------------------------------
In your answer to your pro forma question on the behavior of delta
under elements of the Galois group (p.256), you say that a transposition
changes the sign of delta because it changes the sign of exactly one
factor.
This is not quite true. A transposition (i, i+1) only changes the
sign of (alpha_i - alpha_{i+1}), but a transposition (i, j) with
j-i > 1 also changes the signs of all factors (alpha_i - alpha_k)
and (alpha_k - alpha_j) for all k between i and j. However,
the number of factors of the former sort equals the number of the
latter sort, so their effects on the sign cancel out, so that
(alpha_i - alpha_j), the one factor whose sign is changed that is
not involved in this cancellation, causes the sign of the whole
product to change.
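One can count the affected factors directly. The sketch below (the
function name is mine) lists, for a transposition (i j) acting on
indices 1,...,n, the factors alpha_a - alpha_b with a < b that are
carried to the negative of a factor of the same form.

```python
from itertools import combinations

def flipped_factors(n, i, j):
    """Pairs (a, b), a < b, whose factor alpha_a - alpha_b is sent by the
    transposition (i j) to minus one of the standard factors."""
    swap = {i: j, j: i}
    flipped = []
    for a, b in combinations(range(1, n + 1), 2):
        a2, b2 = swap.get(a, a), swap.get(b, b)
        if a2 > b2:           # image is -(alpha_b2 - alpha_a2)
            flipped.append((a, b))
    return flipped

for i, j in [(2, 3), (2, 5), (1, 6)]:
    f = flipped_factors(6, i, j)
    print((i, j), len(f), f)
```

In each case the count is 2(j-i-1) + 1, an odd number, so the product
as a whole changes sign.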
----------------------------------------------------------------------
You both ask about the need to study pi analytically, as on p.270.
The only definition of pi that we have is analytic. It can be posed
in various ways; the definition used implicitly in this section is
that pi/2 is the smallest positive real number x making cos x = 0.
The classical definition is as the ratio of the circumference to the
diameter of a circle, but the easiest way to make this precise is to
define the circumference as an integral, which is again an analytic
definition. The analytic definitions do not lead to any algebraic
characterization (I don't count exponentiation by a non-rational number
as an algebraic operation) so the only methods we have to study it are
analytic. (If its properties did lead to an algebraic characterization,
this would make it an algebraic rather than a transcendental number.)
----------------------------------------------------------------------
You ask why J_n is an integer in the computation on p.270, given
that the expression for it contains polynomials in pi/2.
The conclusion that it is an integer is obtained under the assumption
that pi is rational (after multiplying out by the appropriate power
of the denominator of pi). This is used to get a contradiction.
----------------------------------------------------------------------
You ask whether the results of this reading (p.270) might be proved
using the theory of transcendental extensions.
I don't think so. That theory concerns the relations among
transcendental elements, but doesn't give us a way to come up
with transcendental elements to start with.
----------------------------------------------------------------------