ANSWERS TO QUESTIONS ASKED BY STUDENTS in Math 114, Spring 2001, taught from Ian Stewart's "Galois Theory".
_________________________________________________________________

You ask about the inclusion Q \subset Q(gamma,delta) \subset Q(alpha,beta,gamma,delta) on p.xxviii, saying that you would expect Q, as the field of all rationals, to contain both these fields of rational expressions.

I think the problem involves the phrases "rationals" and "rational expressions". The field of "rationals" consists of the rational numbers, i.e., elements m/n where m and n are integers. The field of "rational expressions" in, say, gamma and delta consists of all elements that can be gotten using integers, gamma and delta, addition, subtraction, multiplication and division. So, for example, if gamma = sqrt 5 and delta = - sqrt 5, then (3 + 7 sqrt 5) / (9 - sqrt 5) is a member of Q(gamma, delta), but not a member of Q.
_________________________________________________________________

You ask about the relation between the Euclidean algorithm and the Chinese Remainder Theorem.

Well, if R is a ring, and a, b elements of R with hcf = 1, then we have the implications:

   (R satisfies Euclidean algorithm)
                 ||
                 \/
   (Every ideal of R is principal)
                 ||
                 \/
   (aR + bR = R)
                 /\
                 ||
                 \/
   (Every pair of congruences x == u (mod a), x == v (mod b) has a solution in R)

(Here "==" stands for "congruent to".) But neither of the first two implications is reversible: There are rings R which do not satisfy the Euclidean algorithm for any "Euclidean norm function" (was that concept introduced in your previous classes?) but which have the property that every ideal is principal; and in a ring in which not every ideal is principal, there may still be interesting examples of elements a and b such that aR + bR = R, so that by the last implication, one has the "Chinese Remainder Theorem" for those elements. So the Chinese Remainder Theorem applies to a much wider class of cases than the Euclidean algorithm. If you want further references for one or another aspect of this, let me know.

(Examples of rings in which every ideal is principal, but which don't satisfy any Euclidean algorithm, actually take a bit of work to come up with. If you look on my web page, http://www.math.berkeley.edu/~gbergman and click on "Handouts for graduate algebra classes", one of the items you will find there is "A principal ideal domain that is not Euclidean, developed as a series of exercises". It is intended for students who have had the first few weeks of Math 250A; you can look at it and see how much you can follow.)
_________________________________________________________________

You ask why the definition of a field as a ring such that F \ {0} is an abelian group under multiplication implies that ab = 0 => a = 0 or b = 0.

Well, consider two elements a, b of F neither of which is 0. This says both lie in F \ {0}, hence since this is a group under multiplication, it must contain their product. And to say that ab is in F \ {0} is to say that ab is nonzero. (This proves the statement in contrapositive form: If it is not true that a=0 or b=0, then it is not true that ab=0.)
_________________________________________________________________

You ask what examples authors like Stewart wish to include when they allow rings not to have 1.

I think the main justification generally given is that this allows results proved about rings to be applied to ideals, which generally don't contain 1.
But I find that fairly specious; the things one wants to know about ideals can generally be seen by looking at them as subsets of the rings they are ideals in. But there are some nonunital rings that arise naturally. For instance, the ring of "strictly upper triangular nxn matrices", i.e., matrices of the form

   / 0 * * * \
   | 0 0 * * |    where each "*" represents an arbitrary element of the
   | 0 0 0 * |    base ring (e.g., the real numbers), and the "0"s
   \ 0 0 0 0 /    represent 0.

For another example, if V is an infinite-dimensional vector-space, then the set of all linear transformations V -> V whose ranges are finite-dimensional forms an interesting ring without 1. And similarly, if we take all sequences (a_0, a_1, a_2, ... , a_n, ...), say of real numbers, and make these a ring (with 1!) by componentwise operations, i.e., defining

   (a_0, a_1, a_2, ... , a_n, ...) + (b_0, b_1, b_2, ... , b_n, ...) = (a_0+b_0, a_1+b_1, a_2+b_2, ... , a_n+b_n, ...)

and likewise

   (a_0, a_1, a_2, ... , a_n, ...) x (b_0, b_1, b_2, ... , b_n, ...) = (a_0 b_0, a_1 b_1, a_2 b_2, ... , a_n b_n, ...)

then the set of sequences with only finitely many nonzero terms forms a nonunital subring. In each of these cases, the subring in question is an ideal of a larger ring with unit. (In the last two cases, that ring is fairly obvious: just drop the "finiteness" condition from the description given. In the matrix case, it is the ring of upper triangular matrices, not required to be "strictly" upper triangular; i.e., which may have arbitrary entries on the diagonal.) But these nonunital examples are of some ring-theoretic interest in themselves, not just as ideals of larger rings. So I don't say that nonunital rings shouldn't be looked at. Just that they are not an important enough concept to muddy the waters with in a beginning course.

You also ask how Stewart gets the first display on p.3. The second of the two arguments you indicated seems the most straightforward: use the fact that x |-> I+x is a homomorphism, and apply it to the equation ar+bn=1, rewritten ar = 1-bn. I assume this is what Stewart meant. Your other argument involves writing "I + 1 - bn", and requires one to decide what one means by that expression. If you mean to set up a definition in which one can add any set of elements of a ring to any other (and regard 1 and -bn as abbreviations for the corresponding one-element sets), then you need to ask whether such addition of sets is associative, so that you can safely write expressions without parentheses. It is, but one needs to prove it. Alternatively, you may mean "I + (1-bn)", i.e., the coset of the ideal I containing the element 1-bn. In that case, the argument is essentially the same as the first one. All this shows one reason I don't like the "I+x" notation for elements of a quotient ring R/I !
_________________________________________________________________

You ask about the equation (I + r)(I + s) = I + rs on p.2. That equation is the _definition_ of multiplication in a factor ring, so it is not something one can or should prove.
_________________________________________________________________

Yes, what Stewart calls an "hcf" is what is often called a "gcd". He defined the term on p.11, lines 3-4. This may be a difference between British and American usages; at least a difference of which is used more often in which country.
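(A small computational aside, not from Stewart: since the answers above involve hcf's and the Euclidean algorithm, here is a sketch, in Python, of the Euclidean algorithm computing a monic hcf of two polynomials with rational coefficients. The representation of polynomials as coefficient lists and the names poly, poly_mod and poly_hcf are my own, chosen just for illustration.)

    from fractions import Fraction

    def poly(coeffs):
        # a polynomial in Q[t] as its list of coefficients, constant term first
        return [Fraction(c) for c in coeffs]

    def strip(f):
        # drop trailing zero coefficients, so the last entry is the leading coefficient
        while f and f[-1] == 0:
            f = f[:-1]
        return f

    def poly_mod(f, g):
        # remainder of f on division by g (g nonzero): repeatedly cancel the leading
        # term of f by subtracting a suitable constant times a power of t times g
        f, g = strip(list(f)), strip(list(g))
        while f and len(f) >= len(g):
            c, shift = f[-1] / g[-1], len(f) - len(g)
            f = strip([a - c * b for a, b in zip(f, [Fraction(0)] * shift + g)])
        return f

    def poly_hcf(f, g):
        # Euclidean algorithm: hcf(f, g) = hcf(g, f mod g), until the remainder is 0
        f, g = strip(list(f)), strip(list(g))
        while g:
            f, g = g, poly_mod(f, g)
        # normalize to be monic (an hcf is only determined up to a constant factor)
        return [c / f[-1] for c in f]

    # example: the hcf of t^2 - 1 and t^2 - 2t + 1 comes out as t - 1
    print(poly_hcf(poly([-1, 0, 1]), poly([1, -2, 1])))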
_________________________________________________________________

You both asked about the statement near the middle of p.21, that "f(t) is obviously irreducible if and only if f(t+1) is irreducible". I intended to speak about it in class, but ran out of time!

Basically, the point is that one can not only substitute for the indeterminate t in polynomials f(t)\in K[t] any element alpha \in K, getting homomorphisms K[t] --> K; one can substitute any element alpha of any commutative ring R containing K, getting homomorphisms K[t] --> R. In particular, taking R = K[t] and alpha = t+1, one gets a homomorphism K[t] --> K[t] taking f(t) to f(t+1). This has an inverse, the homomorphism taking f(t) to f(t-1). Hence it is an isomorphism, from which one can easily deduce that an element f(t) of K[t] is irreducible if and only if the image of that element under this homomorphism, i.e., f(t+1), is irreducible.
_________________________________________________________________

You ask about the meaning of "highest coefficient" in the discussion before the last example on p.21. Stewart means "coefficient of the highest power of t". So, for instance, in the display earlier on the page, beginning "9f(t)", the highest coefficient of the polynomial on the right is 2.

As to the "graph" of a polynomial over Q (p.24, top), one simply regards Q as a subset of R, and marks points on the graph in the same way as one does in R. Since Q is dense in R, there is no real way of seeing the difference between such a graph and the graph of a function R --> R, unless one artificially draws it with a "dotted line". But the interpretation of such a graph is tricky: The graphs of t^2 - 1 and t^2 - 2 both seem to have zeroes, but over Q, the former does, while the latter does not. So Stewart mentions the idea of graphing functions on Q only in parentheses.
_________________________________________________________________

You ask what Stewart means in Theorem 2.3 by "unique up to order and constant factors." He means that given two factorizations of f, one can be gotten from the other by rearranging the terms ("order") and multiplying them by constant factors. ("Constant" meaning "belonging to K".) E.g., two factorizations of (x^2 - 1/4) are (x + 1/2)(x - 1/2) and (2x - 1)(x/2 + 1/4). If we take the first factorization, reverse the order of the factors, and then multiply the first factor by 2 and the second by its inverse, 1/2, we get the second factorization; so these count as "the same up to order and constant factors".
_________________________________________________________________

You ask whether Proposition 2.4 holds for more general rings than Z. Yes! The proof works with Z replaced by any unique factorization domain, and Q by its field of fractions. (Was "unique factorization domain" defined in your 113?) But it is not true over an arbitrary integral domain. For instance, letting R be the domain Z[2i] = {m + 2ni | m, n \in Z} that I mentioned today, one finds that t^2 + 1 is irreducible in R[t], essentially because -1 is not a square in that ring; but the field of fractions of R is Q(i), so t^2 + 1 = (t+i)(t-i) there.
_________________________________________________________________

You ask how, in the proof of Lemma 2.2, one can have a divisor of f equal to kf. I hope what I said in class clarified this: For k\in K - {0}, k is invertible, hence f = (kf) (k^-1).

You also ask why f|haf and f|hbg imply f|h.
This is because Stewart has just written h as the sum of haf and hbg, and if two elements are divisible by f, so is their sum. (Here I think you should have followed the general guideline: If you don't see the reason for the assertion about something, check what has just been proved about that something, and see whether that is what the author is using.)

As for where Eisenstein's Criterion is applicable, I will talk more about it Wednesday. There is definitely a simpler way to show that it is applicable to t^16 + ... + 1 than trial and error! If p is any prime, and we let f(t) = t^(p-1) + t^(p-2) + ... + t + 1, then Stewart shows on p.168 that it is irreducible by Eisenstein's criterion using that prime p: Go to the proof of Lemma 17.9 on that page and ignore the statement of the Lemma and the second sentence of the proof; the first sentence and the remainder of the proof should make sense. As to whether it is applicable to "a large percentage of irreducibles" -- I doubt it. Nevertheless, when one needs an example for some purpose, it often turns out that one can concoct one that Eisenstein's criterion is applicable to; so it is very useful in that way.
_________________________________________________________________

You ask about the argument at the top of p.20, that since p divides the sum h_0 g_(i+j) + ... + h_j g_i + ... + h_(i+j) g_0, it must divide h_j g_i. It is certainly true, as you say, that p can divide a sum without dividing any of the summands. But if it divides a sum, and divides _all_but_one_ of the summands, then it must divide that one summand as well. (For that one summand can be written as the whole sum, minus all the other summands; and since each of these terms is divisible by p, the resulting element will be too.) Since Stewart has shown why all the other summands are divisible by p, he can conclude that h_j g_i is too.
_________________________________________________________________

You ask whether R needs to be an integral domain in the definition of a zero of a polynomial. No, one can make the _definition_ without assuming R a domain. But for most of the properties of the concept that one wants to prove, it must be assumed a domain. And Stewart does assume it a domain, in fact, a field, when he proves things about it.
_________________________________________________________________

You ask how to find the multiplicative inverse of p + q (cube-root 2) + r (cube-root 2)^2. I started to send you an "easy" answer, then realized that it was not so easy. I hope to find time to talk about it on Friday.

Anyway, here is another argument which, though perhaps not what Stewart had in mind, does work: Consider the set V of elements A + B (cube-root 2) + C (cube-root 2)^2 (A, B, C \in Q) as a vector space over Q. It is finite-dimensional, since it is spanned by {1, cube-root 2, (cube-root 2)^2}. (If we had an easy way of showing at this point that those elements were linearly independent, we could say precisely that it was 3-dimensional; but we can in any case say it has dimension _< 3.) Now if p + q (cube-root 2) + r (cube-root 2)^2 is nonzero, then multiplication by that element is a one-to-one linear map V --> V. A one-to-one linear map of a finite-dimensional vector-space to itself is invertible (Math 110!), in particular, onto, so there is some element of V which this map takes to 1; i.e., an element which when multiplied by p + q (cube-root 2) + r (cube-root 2)^2 gives 1; and that is what we were looking for.
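(To make the argument above concrete, here is a small Python sketch of my own, not anything in Stewart: it finds the inverse by exact rational arithmetic, solving the 3 x 3 linear system which says that multiplication by p + q (cube-root 2) + r (cube-root 2)^2 sends x + y (cube-root 2) + z (cube-root 2)^2 to 1. The function name inverse_in_Q_cbrt2 is made up for the illustration.)

    from fractions import Fraction

    def inverse_in_Q_cbrt2(p, q, r):
        # Write c for cube-root 2, so c^3 = 2.  Multiplying x + y c + z c^2 by
        # p + q c + r c^2 and collecting powers of c gives
        #   (p x + 2 r y + 2 q z) + (q x + p y + 2 r z) c + (r x + q y + p z) c^2,
        # so we solve the system making this equal to 1 (assuming p + q c + r c^2 is nonzero).
        p, q, r = Fraction(p), Fraction(q), Fraction(r)
        M = [[p, 2 * r, 2 * q, Fraction(1)],
             [q, p,     2 * r, Fraction(0)],
             [r, q,     p,     Fraction(0)]]
        for i in range(3):                       # Gaussian elimination with exact arithmetic
            piv = next(k for k in range(i, 3) if M[k][i] != 0)
            M[i], M[piv] = M[piv], M[i]
            M[i] = [a / M[i][i] for a in M[i]]
            for k in range(3):
                if k != i:
                    M[k] = [a - M[k][i] * b for a, b in zip(M[k], M[i])]
        return M[0][3], M[1][3], M[2][3]         # the coefficients x, y, z of the inverse

    # example: 1/(1 + cube-root 2) = 1/3 - (1/3)c + (1/3)c^2,
    # since (1 + c)(1 - c + c^2) = 1 + c^3 = 3
    print(inverse_in_Q_cbrt2(1, 1, 0))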
_________________________________________________________________ You ask what is meant by an "inclusion map". If T is a set and S a subset, then the "inclusion map" of S into T means the map i: S --> T defined by i(s) = s for all s in S. So it is like an identity map, but instead of going from a set to itself, it goes from a set to a set containing it. Unlike an identity map, it is not in general onto. One thinks of it as "including" S in T. You also ask whether the inclusion maps given as examples of field extensions on p.30 are "any different from his previous examples ...", but I don't know what previous examples you mean. _________________________________________________________________ You ask why the field described at the top of p.31 should have the property stated at the bottom of p.30, of being the _smallest_ subfield of K containing X. First, I hope you are clear on what being the "smallest" such subfield means: it means that it is a subfield of K containing X which is contained in every subfield of K that contains X. To see that the field described at the top of p.31 has the latter property, note that any subfield of K is closed under the field operations; hence if it contains the elements of X, it will also contain every element that can be gotten from the members of X using those operations. Clear now? _________________________________________________________________ You ask how to see that statement #2 at the top of p.31 is equivalent to the definition of the subfield of K generated by X. When you have a question like this, you should let me know how far you have been able to get on it, and, if possible, what difficulty you have going further, so that I know what problems to address. You need to show that the set in question is equal to the intersection of all subfields of K containing X. To show two sets equal, one must show that each is contained in the other; equivalently, that every element of the first is contained in the second and that every element of the second is contained in the first. (Did you at least reach the stage of seeing that there were two inclusions to be proved? I will assume you did.) Did you examine each of these inclusions in turn? If you could see that one was true (e.g., that the intersection was contained in the set described at the top of p.31), but not the other, you should say this in your question, and if possible, say what difficulties you encountered when examining the other needed inclusion. If you looked at each inclusion and could not see why _either_ was true, you should say that! As I wrote in the course information sheet, if you asked such a question at my office hours, I would question you to see what you understood and what needed to be explained. But that is very time-consuming to do by e-mail, so you should do as much of it as you can in posing your question. So let me know about these points, and I will then do my best to help. _________________________________________________________________ You ask for an example of a simple transcendental extension, noting that on p.32, Stewart says that K(t), the field of rational expressions, is such an example, but that you don't know what that means. A basic survival technique in mathematics courses is: If you don't know what a phrase or symbol that the author uses means, check the index and/or table of notation, and see whether he has defined it! (In general, I recommend first looking in the preceding paragraphs of the section you are reading. 
But in this case, you wouldn't have found the phrase there.) Looking in the index under "rational expression", you would have found a reference to p.9. Looking in the Symbol index (pp.197-198), you wouldn't have found K(t), but you would have found R(t), with the gloss "field of rational expressions", also referring you to p.9. So look on p.9! Does this answer your question? Of course, if you haven't seen the construction of K(t) before, you need to do a lot of thinking to picture the field described on that page (lines 3-6) and maybe you will have questions about that, which I will be glad to help with. But at least you will be further along than if you were still stuck on Stewart's example. I hope you'll remember that basic technique, "Look in the index and/or table of notations", in the future! _________________________________________________________________ You ask whether, if x and y are transcendental, Q[x] must equal Q[y]. No. What Stewart says is that a simple transcendental extension of a field K is unique _up_to_isomorphism_. That means that Q(x) will be isomorphic to Q(y). (And the same will be true of Q[x] and Q[y].) But not that they will be equal. This is related to what I said in class, that we will often "identify" two isomorphic objects _when_this_is_not_likely_to_lead_to_confusion. When we are studying the internal algebraic properties of a simple transcendental extension of Q, we can speak as though there were just one such object, because they all have the same internal algebraic properties. But if we are looking at the structure of R, we can't safely identify different transcendental subfields, because they consist of different elements of R. _________________________________________________________________ You ask about Stewart's statement that the element t\in K(t) is transcendental because "If ... p(t) = 0, then p = 0 ..." (p.35). Well, if p(t) = Sigma a_i t^i is an element of K[t], then for any alpha in an extension field L of K, Stewart has defined p(alpha) to mean Sigma a_i alpha^i; and we say that alpha "satisfies" the polynomial p if p(alpha) = 0. Now in the special case where L = K(t) and we take alpha = t, we see that on substituting alpha for t as defined above, we get back our original polynomial p(t) (which shows, incidentally, that the definition of p(alpha) and the notation p(t) are consistent). Hence in this case the map "substituting alpha for t" is the inclusion of K[t] in K(t), and this clearly has kernel {0}. _________________________________________________________________ When mathematicians say "identify X with Y", the meaning is something inbetween "pretend X is Y" and "assume X is Y". With that in mind, I hope what Stewart says at the top of p.34 makes more sense. An "inclusion map" means the map from a subset S to a set T that contains it, defined by f(s) = s for all s in S. In other words, it is like an identity map, but if its codomain is larger than its range we can't call it the identity, so we think of it as "including" S in T". _________________________________________________________________ You ask, first, why the situation two lines before the lower Definition on p.35 is "contrary to the definition". It is, as you guessed, because p is defined as having lowest degree among monic polynomials satisfied by alpha, but what is described there would be a monic polynomial of lower degree satisfied by alpha. You also ask whether, when one writes "m(alpha) = 0", this means the zero of the original field, or the zero of the extension field. 
It means the zero of the extension field, insofar as one distinguishes these. (After all, it is only in the extension field that one can evaluate m(alpha).) In general, one identifies K with its image in the extension field, and then one no longer has to make this distinction. But when one constructs K[t]/I as consisting of cosets of I, then one has to distinguish the two zeroes, and one is looking at the zero of the extension field.

You ask how, in constructing a factor-ring R/I, one proves that the map pi(r) = r + I is well-defined. There is no trouble with that! Well-definedness needs to be proved only when a definition requires making a choice, and that definition does not. Where one does have to prove well-definedness is for the _operations_ of R/I. E.g., one defines the product of r+I and s+I to be rs + I. But given an element x \in R/I, there are in general many elements r_1, r_2, ... such that x = r_1 + I = r_2 + I = ... , so the definition of multiplication just mentioned requires choosing one such expression for r, and one for s. Then one has to prove that changing either of these choices doesn't change the coset rs + I. I suggest you try that verification; it's not very hard. If you have trouble with it, ask about it.

If phi: R --> S is a homomorphism whose kernel contains I, you want to know how one gets a homomorphism psi: R/I --> S such that psi pi = phi. Well, to see how we _have_ to define it, we consider what that equation means when applied to an element r of R. It means psi(r+I) = phi(r). So one uses that equation to define psi. Namely, every element of R/I has the form r+I, and we define the result of applying psi to that element r+I to be phi(r). Again, we must prove well-definedness. Again, I suggest you try, and ask me if you have trouble with it.

You say that "the same Psi is actually an injective homomorphism from R/Ker(Phi) -> R'". Not in general. But if you take I = Ker(phi) (rather than letting I be any ideal contained in ker(phi)) then this is true.

Incidentally, when we write Greek letters in e-mail, "Phi", "Pi" etc. refer to the capital Greek letters. But the symbols generally used for maps are the lower-case letters, so it is best to call them "phi", "pi" etc.. (See the handout on the Greek alphabet for the difference in appearance between the capital and lower-case forms.)
_________________________________________________________________

You ask whether in the Definition on p.40, "monomorphism" could be replaced by "homomorphism".

First, have you noted the definition of monomorphism? Do you realize that it _is_ a homomorphism, but just with one additional condition, namely, of being one-to-one? Second, have you noted Lemma 3.3 on p.36, which shows that any homomorphism whose domain is a field is a monomorphism unless it is the homomorphism that sends every element to 0 ? With these two facts in mind, you can see that the map i that Stewart starts with is simply any homomorphism of fields, other than the one that sends every element to 0. We are not interested in the map that sends everything to zero, because it messes other things up (e.g., it won't satisfy f(x^-1) = f(x)^-1), so by assuming i a monomorphism, Stewart is really saying "a homomorphism other than the bad one that we don't want to deal with."
As for the i-hat of his conclusion, in calling it a monomorphism, he is saying that it is a homomorphism, just as you would like, but he is also saying more: that it is one-to-one (which, since K[t] and L[t] are not fields, is more than just saying it is not the zero map). As I mentioned in class, I don't like the fact that Stewart uses the term "monomorphism" as a way of saying "field homomorphism, other than the zero map". I believe that the best definition to use for a homomorphism of rings with 1 is to require it to take 1 to 1 as well as respecting addition and multiplication. If we put this into our definition, then the zero map is not a homomorphism of fields, and in most of the places where Stewart says "monomorphism", one could simply say "homomorphism", and the one-one-ness will follow by Lemma 3.3. But since he has written the book, and chosen what definitions to use, we must accept that "monomorphism" is the appropriate term for him to use for what he is talking about. _________________________________________________________________ You ask whether, where Stewart writes "... either [M:L] = infinity or [L:K] = infinity", they can both be infinite. Certainly! "Or" is not exclusive. (If the prerequisite for a course is "math 110 or math 113", someone who has taken both has certainly satisfied the prerequisites for the course. In many situations, context makes "or" exclusive, e.g., "It must be an animal or a plant", but that is not a consequence of the meaning of "or".) You also ask why, in the proof of Lemma 4.4, "any algebraic extension K(alpha_1,..., alpha_s) : K is finite", when Stewart has just said that not every algebraic extension is finite. Although not every algebraic extension is finite, he is saying here that every algebraic extension _of_the_form_ K(alpha_1,..., alpha_s) : K is so. An algebraic extension of that form is one that can be gotten by adjoining _finitely_many_ algebraic elements alpha_1,..., alpha_s to K. (The extenson A:Q discussed at the bottom of the page is an example which cannot be so obtained.) You suggested that the proper interpretation of what Stewart meant might be "a polynomial extension of finite degree". But by the definition at the bottom of p.47, that is exactly the same as a finite extension. So "finite degree" is what we are trying to prove here, not what we are assuming. You based that guess on the fact that Stewart said "let n = [L:K] ...". But notice that he says this after the word "Conversely". Saying "Conversely" is a signal that he is about to prove the _converse_ of what he has just proved; so the old conclusion becomes the new hypothesis, and vice versa. So in the half of the proof after that word, the condition of being a finite extension, equivalently, having "finite degree, is assumed, and the condition of "having the form K(alpha_1,..., alpha_s)" is then to be proved. But the reverse is true before that word. I hope I don't sound as though I am being over-critical of your question; my aim is to point out some things about how to read mathematical writing. _________________________________________________________________ You ask what Stewart means by the "representative polynomials" in the last paragraph of p.37. He means the polynomials chosen in the preceding sentence -- one representative from each coset, namely, the unique polynomial in that coset that has degree < the degree of m. 
Whenever one has a one-to-one correspondence between two sets, any operations on one of them can be used to induce "corresponding" operations on the other. (E.g., since we have a one-to-one correspondence between integers 1, 2, 0, -1 etc. and number-words "one", "two", "zero", "minus one" etc., the operations on the integers, "1+1=2" etc. induce operations on the number-words, "one + one = two" etc..) Now here we have a one-to-one correspondence between cosets and representative polynomials (one representative polynomial in each coset); hence the operations on cosets induce operations on the representative polynomials.

You wrote "I think I got the general idea, but ...". If you were at my office hours, I could question you on the general idea you had, and know what I needed to clarify or correct. Since we aren't face to face when you send in a question of the day, it would help if you indicated what you had been able to figure out or guess for yourself, rather than having me fish blindly.
_________________________________________________________________

You ask how Stewart gets the condition that phi|_K is the identity, at the end of the proof of Theorem 3.6. Well, he has given a precise definition of how phi acts on any element. Try applying that definition to the case where phi is applied to an element of K, and see what it gives. Write me again if you have difficulty with this.
_________________________________________________________________

You ask about Stewart's statement in the proof of Theorem 3.8, that every element of K(alpha) can be written x_0 + ... + x_n alpha^n where n = deg(m) - 1, and ask how we can assert the exact equality "n = deg(m) - 1" rather than just "n < deg(m)". If we were requiring x_n to be nonzero, as in the definition of the degree of a polynomial, then we could only say "n < deg(m)". But we are not requiring that; so if we have an expression with fewer terms, we just fill in zero terms to get the indicated form.
_________________________________________________________________

You ask what Stewart means by saying that the operation which turns L into a vector space over K "simply forgets some of the structure."

Well, if we know the structure of L as an extension of K, we know a lot about it -- it is a set, with operations of addition and multiplication, and a map of K into it, and combining the multiplication with the map of K into it, we can define multiplication of elements of L by elements of K, and see that L is a K-vector-space. But what we know about it is more than a structure of K-vector-space, since we have a way of multiplying elements of L by other elements of L even when neither element comes from K. If we forget about how to multiply elements of L by general elements of L, and just remember how to multiply them by elements of K, and how to add them, we are simply left with the K-vector-space structure. So this K-vector-space structure is gotten by forgetting some of the structure.

As an example, Q(sqrt 2) and Q(sqrt 3) are non-isomorphic field extensions of Q -- we saw at the end of class today that Q(sqrt 2) does not contain any element which when multiplied by itself gives 3, while Q(sqrt 3) does. However, if we forget how to multiply members of these extension fields by each other, and simply remember their Q-vector-space structures, then we see that they are isomorphic as Q-vector-spaces: the correspondence p + q sqrt 2 <--> p + q sqrt 3 is a vector-space isomorphism, though not a field isomorphism.
So in regarding them as Q-vector-spaces, we have forgotten so much of the structure that we can no longer tell them apart.
_________________________________________________________________

You ask how the uniqueness clause of Lemma 3.7 implies that, in the situation of Proposition 4.3, the set {1, alpha, (alpha)^2, ..., (alpha)^(n-1)} is linearly independent.

A set of elements {x_1, ..., x_n} of a vector space V is linearly dependent if and only if the expression for 0 as a linear combination of these vectors is _non_unique. For "0x_1 + ... + 0x_n" is always an expression for 0 as a linear combination of these elements, while linear dependence of these elements means the existence of an expression a_1 x_1 + ... + a_n x_n for 0 in which not all a_i are zero; i.e., which is different from "0x_1 + ... + 0x_n". So if we know that every element of the field can be expressed uniquely as a linear combination of 1, alpha, (alpha)^2, ..., (alpha)^(n-1), then in particular, 0 can be expressed uniquely, so that set is linearly independent.

(In fact, it is not hard to show, using the above and a few similar arguments, that a subset B of a vector space V is a basis if and only if every element of V can be expressed uniquely as a linear combination of members of B. Using this criterion, we see that Lemma 3.7 is precisely equivalent to the statement that {1, alpha, (alpha)^2, ..., (alpha)^(n-1)} is a basis of K(alpha) over K.)
_________________________________________________________________

You ask whether in a vector space over a general field there is a danger "that a vector could be linearly dependent with itself (ie if there are multiple scalars that result in the same answer after multiplication?)"

I guess you mean, "If x is a nonzero vector in a vector space over a field K, could there exist scalars a not-equal-to b in K such that ax = bx?" No. The equation ax = bx is equivalent to (a-b) x = 0; and since K is a field and a and b are distinct, K contains an inverse to a-b. Multiplying the above equation by that inverse, we get the relation x = 0. This contradicts the assumption that x is nonzero.

(However, if one replaces the field K by a general ring R, then the analog of a vector space over K is what is called a "module" over R, and what you ask about can definitely happen in that context, since nonzero elements of R need not be invertible. For instance, a module over the ring Z of integers is the same as an abelian group, and in abelian groups, relations n x = 0 are well-known. For another example, if R is the ring of n x n matrices over a field, then one can apply these matrices to column vectors of height n over that field, so the set of such vectors forms an R-module. In this module we have A x = 0 whenever the vector x belongs to the null space of the matrix A.)

(There's also a situation in which something like what you referred to _appears_ to be happening, but really isn't. If K is a field of nonzero characteristic p, and n is a multiple of p, then for any element x of any K-vector-space, nx = 0. But here n and 0 are not distinct elements of K, so there is no contradiction.)
_________________________________________________________________

You ask whether it makes sense to talk about C(t) as an extension of R(t), and ask for [C(t):R(t)]. Yes. The inclusion of R in C induces an inclusion of R[t] in C[t], as noted in the "Definition" (which isn't really a definition) on p.40.
This in turn induces a map of fields, R(t) --> C(t), which can be thought of as an inclusion (sending each rational function f(t)/g(t) in R(t) to the same rational function f(t)/g(t) in C(t); although strictly speaking, looking at rational functions f(t)/g(t) as certain equivalence classes, they are not the same things.) So one can ask what [C(t):R(t)] is. In fact, it is 2, just like [C:R]. Can you see how to prove this?
_________________________________________________________________

I hope that the discussion in class answered your question about the relation between ruler and compass constructions, and subfields. The ruler and compass operations apply to points of the plane; the subfields are subfields of R. So the points of the plane are not elements of the fields in question; it is the two coordinates of each point that we are regarding as field elements. And it is not the ruler and compass constructions that give the field operations; rather, Stewart _defines_ K_i as the _field_ obtained by adjoining certain elements to another field; so it is the definition of "field obtained by adjoining elements to another field" that gives closure under the field operations.
_________________________________________________________________

You ask about "constructing pi", as in Stewart's discussion in the first two lines of p.57. Stewart is not saying that one _can_ construct pi. He is saying that _if_ one could square the circle, then one would be able to construct pi! Reread the Theorem at the bottom of p.56 and the two sentences at the top of p.57 carefully. If you still have difficulty (which at least will not be the difficulty of thinking that Stewart is saying we really _can_ construct pi), let me know (or come to office hours) and I'll try to help.
_________________________________________________________________

You ask how, near the end of the proof of Theorem 5.2, Stewart concludes that [K_0(x):K_0] is a power of 2. He has pointed out that [K_n:K_0(x)] [K_0(x):K_0] = [K_n:K_0], and shown that [K_n:K_0] is a power of two. Hence [K_0(x):K_0] is a divisor of a power of two. But every divisor of a power of two is a power of two!
_________________________________________________________________

You ask how to construct parallel lines with ruler and compass. Given a line L and a point P not on L, drop a perpendicular L' from P to L, then erect a perpendicular L'' to L' at P. Then L'' will be parallel to L, and pass through P.

                 |
                 |
                 |
     ------------*-----------------  L''
                 |P
                 |
                 |L'
                 |
                 |
     ------------*-----------------  L
                 |
_________________________________________________________________

You ask in what sense Plato considered the line and circle the only "perfect" geometric figures. I wondered about that myself. My guess is that the condition they satisfy that he considered "perfection" is perfect symmetry -- every point is "just like" every other, since one can rotate a circle to carry any point to any other, and translate a line with the same result. This is supported by the interest of the Greeks in regular polygons and polyhedra; these are polygons and polyhedra which have the property that every vertex, every edge, and (in the case of polyhedra) every face corresponds to every other vertex, edge and face under motions that preserve the figures. But that's just my guess.
_________________________________________________________________

You ask for a hint on how to do Exercise 5.11.
OK, my hint is: First figure out how, using the weakened operations of that exercise, to perform the construction "Given points A, B, C, construct a line through A parallel to BC." Then, given points P_0, Q_0, Q_1, if you want to draw a circle centered at P_0 and with radius Q_0 Q_1, construct a parallelogram three of whose vertices are P_0, Q_0, Q_1, and note that the desired circle can be gotten using the weakened operation described, and two vertices of that parallelogram.
_________________________________________________________________

You ask what the point is, on p.57, of discussing a lot of possible kinds of geometric construction. The idea of limiting constructions to ruler and compass is a historical accident; one could ask what the point is of spending so much effort on what can be done using that specific set of tools. We do so because of the historical interest, but it is certainly also of interest to ask "What if we changed the set of tools, or the precise description of what we are allowed to do with them?" Stewart records some results on that subject. They aren't relevant to the rest of this course; just some notes that the reader interested in the topic could follow up.
_________________________________________________________________

You ask what Stewart means on p.57 by "linearly independent circles". Good question! I don't know. One guess is that if the three circles have equations (x-a_1)^2 + (y-b_1)^2 = r_1^2, (x-a_2)^2 + (y-b_2)^2 = r_2^2 and (x-a_3)^2 + (y-b_3)^2 = r_3^2, then the functions (x-a_1)^2 + (y-b_1)^2 - r_1^2, (x-a_2)^2 + (y-b_2)^2 - r_2^2 and (x-a_3)^2 + (y-b_3)^2 - r_3^2 should be linearly independent in the space of polynomial functions on R^2. Another possibility is that he means something like that the three centers should not lie on a line. I'll include this in the comments I send Stewart at the end of the semester. If you're very interested in knowing the answer, I could ask whether a colleague knows, or e-mail Stewart about this particular question now.
_________________________________________________________________

You ask what the connection implied on p.71 between the method of studying field extensions in terms of their automorphism groups, and the Erlanger Programm is. Well, the Erlanger Programm was to study geometries in terms of their groups of "automorphisms", so in a loose way the ideas are the same. I can't claim that they are exactly the same, but I will note that one of the differences is only apparent: Stewart in this paragraph goes from field structure to group of automorphisms, but his description of the Erlanger Programm goes the other way, from the group to the geometric structure (the "structure" being interpreted as whatever is invariant under the given group). However, as we see later in this chapter, in Galois Theory one also goes from a group H to what is invariant under the group, namely, the subfield H^dagger.
_________________________________________________________________

You ask how the relation H \subset G implies that the fixed field of G is a subset of the fixed field of H (p.74). If x is an element of the fixed field of G, that means it is fixed by the actions of _all_ elements of G. Since H is a subset of G, x will in particular be fixed by all elements of H, i.e., it will belong to the fixed subfield of H. (For a concrete illustration with the extension Q(sqrt 2, sqrt 3) : Q, see the small sketch following the next answer.)
_________________________________________________________________

You ask what the "K" in the next-to-last line before the exercises on p.75 is. It should say "Q".
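(Here is the sketch promised above, illustrating "H \subset G implies that the fixed field of G is contained in the fixed field of H" for the extension Q(sqrt 2, sqrt 3) : Q. It is my own illustration in Python, with elements written as tuples (a, b, c, d) standing for a + b sqrt 2 + c sqrt 3 + d sqrt 6.)

    from itertools import product

    # An automorphism over Q can only change the signs of sqrt 2 and sqrt 3
    # (and hence of sqrt 6); on the tuple representation it acts coordinatewise.
    def automorphism(s2, s3):
        return lambda v: (v[0], s2 * v[1], s3 * v[2], s2 * s3 * v[3])

    G = [automorphism(s2, s3) for s2, s3 in product([1, -1], repeat=2)]   # the whole group
    H = [automorphism(1, s3) for s3 in (1, -1)]                           # the subgroup fixing sqrt 2

    def fixed_basis_vectors(group):
        basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]  # 1, sqrt 2, sqrt 3, sqrt 6
        return [b for b in basis if all(g(b) == b for g in group)]

    print(fixed_basis_vectors(G))   # only (1, 0, 0, 0): the fixed field of G is Q
    print(fixed_basis_vectors(H))   # also (0, 1, 0, 0): the fixed field of H is Q(sqrt 2), which contains Q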
_________________________________________________________________ In connection with Stewart's statement on p.75, last line before exercises, "So in this case the Galois correspondence is a bijection", you ask what he means by "the Galois correspondence." That sentence is definitely poorly worded. What he wants to say is, essentially, "So in this case, the maps *: {intermediate fields} --> {subgroups} and dagger: {subgroups} --> {intermediate fields} are inverse to one another, and so give a bijection between the set of subgroups and the set of intermediate fields." As to whether this must be the case whenever the extension is obtained by adjoining finitely many elements -- no; example "2." in the middle of p.73 is a counterexample. _________________________________________________________________ Concerning Stewart's answer to Ex.5.10(h), you ask why pi is not transcendental over R Because it is a zero of the polynomial t - pi, which has coefficients in R. (Note that this polynomial does not have coefficients in Q, so this argument does not contradict the result "pi is transcendental over Q" which Stewart quotes.) _________________________________________________________________ You ask why when Stewart adjoins a zero "zeta" of t^2 + t + 1 to Z_2 on p.81 he gets a field with four elements, rather than just the three elements {0, 1, zeta}. For this, you have to remember the meaning that Stewart has defined for the word "adjoin". If you look it up in the index, you will see that it is defined on p.31, where the field "obtained from K by adjoining Y" is defined to be, not the set K \union Y, but the _field_generated_by_ that set. Now the 3-element set that you refer to, {0, 1, zeta}, would not be a field; it would not be closed under addition. So to get the field it generates, we have to bring in the sum 1 + zeta. There's a lot more to be said about this example; in particular, that the exact set of elements in such a field is described by Lemma 3.7. But the first step is to clarify what you misunderstood. If after looking at this construction in the light of the above answer you have further questions, let me know. _________________________________________________________________ You ask why Stewart says in part 3 of the Example on p.80 that t^2 - 2t - 2 splits over Q(sqrt 3). Well, in part 2 of that example, he has already observed that the zeros of t^2 - 2t - 2 in C are 1 +- sqrt 3. (These are obtained using the quadratic formula. We haven't developed that formula as of this point in this course, but we can always use it and check by hand that the results we get, when substituted into the polynomial, are indeed zeros). Since 1 +- sqrt 3 lie in Q(sqrt 3), we can write t^2 - 2t - 2 = (t - (1+sqrt 3)) (t - (1-sqrt 3)) over that field. Even if you hadn't noticed the connection with the preceding example, you should probably have asked yourself "What are the zeros of t^2 - 2t - 2 in C ?" When you figured out the answer, you would have seen why Stewart had made his claim. _________________________________________________________________ You ask why on p.81, in the next-to-last sentence of section 8.1, Stewart says that t^2 + t + 1 doesn't split in any smaller field than Z_2 (zeta). Well, any subfield of Z_2 (zeta) must contain its prime subfield Z_2. (To be a field it must contain 0 and 1, and these always generate the prime subfield of a field -- in this case, they comprise it.) 
And if the given polynomial is to split over that subfield, the subfield must contain the zeroes of the polynomial, which are zeta and 1+zeta. So, as it contains Z_2 and contains zeta, such a subfield must contain the field they generate, Z_2 (zeta). _________________________________________________________________ You asked a question about Chapter 6. But Chapter 6 is not in our reading! If you look at the list of readings you will see that it is the one chapter that we skip entirely. Well, I've looked at the computation you asked about, equation (6.1) on p.62. When Stewart writes "Integrating by parts", he is summarizing a more complicated computation. In the preceding integral, write the integrand, (1 - x^2)^n cos(alpha x)dx, as u dv where u = (1 - x^2)^n and dv = cos(alpha x)dx. When you apply the formula for integration by parts, note that assuming n >_ 1, the difference between the values of the term uv at -1 and +1 is 0, because the factor u is 0 at both ends. This leaves us with the integral of v du. This differs in form from the original expression because it involves a sine rather than a cosine, so we repeat the process. Since Stewart has assumed n >_ 2, the "uv" term in the result still contributes nothing, while the integral is now of approximately the desired form. Precisely, we have (1-x^)^(n-2) times an expression involving a constant term and a multiple of x^2. Writing this in terms of a constant and a multiple of (1-x^2), we see that it becomes a linear combination of (1-x^2)^(n-2) and (1-x^2)^(n-1), so the integral becomes a linear combination of I_(n-2) and I_(n-1). I haven't checked the details of the computation. If you want to work it through, and have further difficulties, let me know. But please keep track of the sequence of readings for this course, and remember that your question has to be on the current reading. If you want to include with it some question(s) about past material, or even material we have skipped, that's OK; but it doesn't remove the requirement of asking a question about the reading for the day. The reading for Wednesday is #9. _________________________________________________________________ You ask about the classification of finite fields. This is done in Chapter 16. The result of the classification is much simpler than the situation for finite groups. As you can see from the chart on p.xii, Chapter 16 can be read immediately after Chapter 8; so though we will be reading the book in order, you can look at that chapter at the end of this week if you wish. If you do and have any questions, let me know. _________________________________________________________________ You ask what the "zeta" on p.81, line 4 is. It is a zero of the irreducible polynomial t^2 + t + 1, adjoined to Z_2 via the construction of Theorem 3.5. How could you have seen that Stewart meant this? Note that he wrote two lines earlier "we must go back to the basic construction of the splitting field". This construction was given in the proof of Theorem 8.1. Following that proof, you find the relevant step to be "Using Theorem 3.5 we adjoin sigma_1 to K ...". And looking back at the proof of Theorem 3.5, you would see how that is done. _________________________________________________________________ You ask about the last paragraph (middle of p.83) of the proof of Theorem 8.4, and why Stewart assumes there that theta_1\in L. He wants to prove that L is normal, i.e., that if it contains one zero theta_1 of f, then it contains every other zero, theta_2 of f. 
So in this paragraph, he introduces the assumption theta_1\in L, and shows that (as a result of the preceding computations) this implies that theta_2\in L.
_________________________________________________________________

You ask whether, if K is a field, and alpha, beta are elements of an extension field which are linearly independent over K, then we must have K(alpha, beta) = K(alpha + beta). No. For a counterexample, let

   K = Q,   alpha = sqrt 2,   beta = sqrt 3 - sqrt 2.

Then Q(alpha, beta) = Q(sqrt 2, sqrt 3), but Q(alpha + beta) = Q(sqrt 3). But in this case, something _like_ what you stated is true, as one sees in proving part (b) of Exercise 4.19.
_________________________________________________________________

I'm not sure whether you cleared up to your own satisfaction the question of why we can rearrange generators, e.g., why if K is a field, and theta_1, alpha_1, ..., alpha_n are members of some field M containing K, then

   (1)  K(theta_1) (alpha_1, ..., alpha_n)
 = (2)  K(theta_1, alpha_1, ..., alpha_n)
 = (3)  K(alpha_1, ... , alpha_n, theta_1)
 = (4)  K(alpha_1, ... , alpha_n) (theta_1).

Intuitively, (1)-(4) are four descriptions of the subfield of M generated by the set

   (5)  K \union {theta_1, alpha_1, ..., alpha_n}.

(Recall that this subfield can be described as the least subfield of M that contains the above set (5), as the intersection of all subfields of M that contain (5), or as the set of elements of M that can be obtained from the elements of (5) by field operations.) In the case of the fields numbered (2) and (3) above, the statement that they are the subfield of M generated by (5) is equivalent to their definition (p.31). To establish that the remaining cases can also be so described, you should verify for yourself that (1) and (4) are fields which contain the set (5), and that any subfield of M which contains the set (5) must contain (1) and (4). Therefore they, too, are equal to the subfield of M generated by the set (5).
_________________________________________________________________

You ask what Stewart means on p.82, a few lines before the Theorem, about an extension having "well-behaved" or "badly behaved" Galois group according to whether "the Galois correspondence is a bijection". The Galois correspondence actually consists of a pair of maps, "*" and "dagger", so it would be better if rather than saying it "is a bijection", Stewart said that those maps were inverse to one another (and so give a bijective correspondence).

Anyway, you say that he can't mean what he seems to, because for Q(sqrt 2) : Q, the correspondence is a bijection, but "the extension is not normal" because (x^2 - 2)(x^2 - 3) has one root in that extension, but doesn't completely split. Well, in fact that extension is normal -- if you look at the definition at the top of the page, you will see that it says that every polynomial _irreducible_over_K_ with a root in L splits over L. But (x^2 - 2)(x^2 - 3) is not irreducible over Q, as shown by the fact that you have factored it!
_________________________________________________________________

You ask for examples of non-normal simple extensions that are significantly different from the book's. If f is any irreducible polynomial over Q of degree > 1 but having only one real root alpha, Q(alpha) will be simple, but it won't be normal because it will contain the root alpha of f, but not the other roots, since they're not real.
One can get lots of examples of this sort which one can prove irreducible using Eisenstein's criterion; e.g., anything of the form t^3 + a t + b where a and b are integers, a is positive (so that the polynomial is everywhere increasing, and can only have one real root), and a is divisible by some prime that also divides, b, but whose square doesn't. (E.g., a = 100, b = any of 2, 5, 6, 10, 14, 15, 18, 20, ... ). Of course, not all non-normal simple extensions have this form; but because we can use elementary properties of the real numbers to test for number of real roots, and Eisenstein's criterion to prove irreducibility, it gives lots of easy examples. _________________________________________________________________ You asked why Proposition 4.3 was applicable on p.83; specifically, why theta_1 and theta_2 both have f as their minimal polynomial. Remember that the minimal polynomial of any element alpha is the monic common divisor of all polynomials having alpha as a zero. Since f has theta_1 as a zero, it must be divisible by the minimal polynomial of theta_1; but since it is irreducible, it has (up to units) no divisors other than itself and 1. So it must (up to units) equal the minimal polynomial of theta_1 -- and similarly, that of theta_2. So strictly speaking, it is not f itself, but the monic polynomial that one gets by dividing f by its leading coefficient that is the minimal polynomial of theta_1. Anyway, by the same argument, the same polynomial is the minimal polynomial of theta_2, so Prop.4.3 (and likewise, Theorem 3.8) is applicable. _________________________________________________________________ You ask what the concept of separability is good for. As I noted on Wednesday, inseparability represents an obstacle to our goal of describing intermediate fields of an extension L:K as fixed fields determined by subgroups of the Galois group: If L = K(alpha) where alpha is a pth root of an element of K, then all the roots of the minimal polynomial of alpha over K are the same, so no automorphism of K(alpha) over K can move alpha, so we can't distinguish K(alpha) from K in terms of what automorphisms fix its elements. Although I didn't go into the case of inseparable elements with more complicated minimal polynomials than t^p - a, these have similar problems. As we will see, the two conditions "separable" and "normal" are what we need to make Galois theory work. Galois theory will turn out to give strong information about the structures of field extensions, so it will not be surprising to learn that inseparable extensions differ from separable ones in further ways; but the above should be good enough as a starting point. _________________________________________________________________ You ask why e^{2 pi i/5} etc. are zeroes of t^4 + t^3 + t^2 + t + 1. To answer that, I need to know how much you know about complex exponentiation. Do you know where e^{a+bi} lies on the complex plane? Do you know how to multiply complex numbers expressed in terms of their angles with respect to the real axis and their magnitudes? ... As I say on the course information sheet, if you asked such a question at my office hours, I would question you to see what you understood and what needed to be explained. But that is very time-consuming to do by e-mail. 
If you had either said that you didn't understand how to raise e to an imaginary power, or said that you understood how to define such an exponential, but didn't see what special properties that power of e would have, or said that you saw some special properties of that element (and said what they were) but that you didn't see how they led to its being a zero of the polynomial -- then I would have known where to start! So let me know how much you know about the topic, and I'll go from there! _________________________________________________________________ You ask about Stewart's writing "tau = v(u) / w(u)" just before the last display on p.84. He is using the fact that tau has been assumed to lie in the field of fractions K of the polynomial ring K_0 [u]. Any element of the field of fractions can be written as a fraction with numerator and denominator in the ring. _________________________________________________________________ You ask how Stewart deduces near the beginning of the proof of Proposition 8.6 that Df = 0. f and Df share a common factor of degree >_ 1 (Lemma 8.5). This means a common factor in K[t]. (Stewart doesn't make that explicit in the statement of the Lemma, but is clear from the proof.) But since f is irreducible, the only factors of f in K[t] are (up to multiplication by units) f and 1. Since 1 does not have degree >_ 1, the common factor must be f; so Df must be divisible by f. But by the properties of formal differentiation, Df has degree < deg f. What multiples of f have smaller degree than deg f? Only 0! _________________________________________________________________ You ask whether Galois theory can be extended to infinite field extensions and groups. I hope I will find time to say a little about this in class in a few days, but in case I don't, here are some brief facts. For non-algebraic extensions, I don't think there is any nice theory. E.g., consider the simple transcendental extension Q(u):Q. This has an automorphism sending u to u+1 (and hence r(u) to r(u+1) for every rational function r). Let H be the cyclic group generated by that automorphism. It is not hard to show that the fixed field of that automorphism, and hence of the group H, is Q. But the full group of automorphisms fixing Q is much larger than H -- it is noncommutative, and consists of automorphisms taking u to (au+b)/(cu+d) for all rational numbers a, b, c, d with ad-bc nonzero. If one starts with a larger transcendental extension, things are even worse -- one can get examples similar to the above, but where the full group of automorphisms is so big no one even knows how to generated it. For infinite separable normal _algebraic_ extensions, however, there is a nice theory, though things are not as straightforward as in the finite case. As an example, let L be the field generated over Q by the square roots of all positive integers, equivalently, by the square roots of all prime numbers. Note that an automorphism of L over Q will send every square root of a positive integer to itself or its negative, and will be determined by what it does to the square roots of the primes (since every positive integer is a product of primes). So if alpha is an automorphism of L over Q, we can specify alpha by specifying the set S_alpha of prime numbers p such that alpha(sqrt p) = - sqrt p. One can prove that for every set A of primes there exists an automorphism alpha such that S_alpha = A; so automorphisms of L correspond to subsets of the set of all primes. 
It is not hard to see that the automorphisms corresponding to _finite_ sets of primes form a subgroup H of the Galois group G, and that the fixed field of H is just Q. Hence H-dagger-* is the whole group G, which is again much bigger than H. However, in this case there is a natural way to get H-dagger-* from H. Note that every element alpha of G can be "approximated arbitrarily closely" by members of H, meaning that for every finite subset of L, there is a member of H which acts the same on all members of that finite subset as alpha does. This has the consequence that G is the _closure_ of H in a certain topology; and this is representative of the general situation in infinite Galois theory: The operators * and dagger give one a bijective correspondence between intermediate fields and _closed_ subgroups of the Galois group with respect to a certain natural topology. (I have a write-up on the subject in the "graduate course handouts" page on my web page -- the first item under the "250B" list -- but it assumes a lot of advanced material.) _________________________________________________________________ You ask whether the introduction of the concept of linear independence of monomorphisms means that the set of such monomorphisms forms a vector space. No. The set of _all_ maps from K to L forms a vector space, and the monomorphisms are a _subset_ (not a subspace!) of that vector space. In fact (as Stewart proves), they are a linearly independent subset, and a nonempty linearly independent subset of a vector space can never be a subspace. (Although a sum of homomorphisms will be a map f such that f(x+y) = f(x) + f(y), it will not satisfy f(xy) = f(x) f(y). You can check this for the sum of the identity map of C and the complex-conjugation map; or the sum of any homomorphism with itself.) _________________________________________________________________ You ask for other examples of the "trick" used to prove Lemma 9.1. I looked in my file on the subject, which reminded me that one such example comes up at a more elementary level than this: The linear independence of eigenvectors corresponding to distinct eigenvalues of a linear transformation. Another case of the method occurs in Stewart's proof of Theorem 9.4, or, in my handout, in Proof 1 of Lemma 9.6. But as Proof 2 of that lemma shows, this can really be done without an independent application of the trick. For the application of similar methods to skew fields see Jan Treur, Separate zeros and Galois extensions of skew fields, J. Algebra 120 (1989), 392-405. For a different sort of application, see section 30 of "Cogroups and co-rings in categories of associative rings", by myself and A. Hausknecht, where we get a result equivalent to something called "Sweedler's pre-dual of the Jacobson-Bourbaki Theorem". The "trick" is used on the lower part of p.159. I don't remember whether it occurs similarly in the paper of Sweedler's we refer to, nor in the results of Jacobson and Bourbaki that Sweedler was "predualizing"; but you can follow the references and see. Lemmas 5.9.1 and 5.9.2 and Theorem 5.9.3 of P.M.Cohn's book "Free Rings and Their Relations" (2nd edition) use the same method. _________________________________________________________________ You ask why Stewart says, in the first sentence on p.89, that the Fundamental Theorem of Galois Theory will require separability and normality. Separability and normality are not actually needed for the equality H-dagger-* = H that Stewart mentions in this sentence. 
They are, however, needed for the other half of the Fundamental Theorem, the equation M*-dagger = M. _________________________________________________________________ You ask how the proof of Lemma 9.1 on p.90 uses the assumption that the lambda_i are nonzero. Because the zero homomorphism is excluded, the least n for which an equation (9.1) holds must be greater than 1 (since a single nonzero homomorphism can't satisfy such an equation). Hence one can choose distinct homomorphisms lambda_1 and lambda_n to finish the proof. _________________________________________________________________ You ask why Stewart says, in the proof of Lemma 9.1 on p.90, that without loss of generality we can assume all a_i nonzero. Because if any of them are zero, we can drop those terms from the equation, and get a true equation of the same form (9.1), but with no zero coefficients. Note that this is not quite the same equation we had to begin with. (It is a linear dependence relation among a smaller set of lambda's.) This is in fact what mathematicians generally mean when they say something is true without loss of generality: that if we have a case in which it is not true, we can modify it (usually in some straightforward way) to get a case in which it is true, and such that proving what we want about that modified case will establish what we wanted to begin with. (In this case, it will give a contradiction.) _________________________________________________________________ You ask how Stewart deduces from the final display on p.84 that f is irreducible. He began that paragraph by assuming that f was not irreducible, namely that it factored f = gh. From this, he obtained the final display, which led to a contradiction, since the degree of the first term is a multiple of p, while that of the second term is one more than a multiple of p, hence they cannot cancel, hence the equation cannot hold. This contradiction shows that the assumption that f was reducible was false. _________________________________________________________________ As you say, the map g_i |-> g g_i of Stewart's Lemma 9.3 will not be a homomorphism. You ask whether it would be an automorphism if G was abelian and g was idempotent. The trouble with this is that in a group, the only idempotent element is the identity. For if x = x^2, then multiplying that equation by x^-1, we get e = x. However, there are several related ways to get homomorphisms or automorphisms. (By the way, the word for "homomorphism into itself" is "endomorphism", so I will use that below.) (a) If we replace "group" by "monoid" (defined like a group, but without the assumption that every element has an inverse), then we can have nonidentity idempotent elements, and it is true that in an abelian monoid, multiplying by such an element is an endomorphism. (More generally, in any monoid multiplying by an idempotent element in the center is an endomorphism.) But it still won't be an automorphism, because whenever x is an idempotent element other than e, multiplication by x is not one-to-one: x.x = x.e. (b) If we take two elements g and h, and multiply by one on the right and the other on the left, we still get a bijection, g_i |-> g g_i h. In a group, this will be an automorphism if g = h^-1. This automorphism is called conjugation by h; you've probably seen it in 113. (c) If we take a group, and define on it a 3-ary operation <x,y,z> = x y^-1 z, and then forget the original group operation, then the resulting system of a set together with this 3-ary operation is called a "heap".
Now the original operation g_i |-> g g_i, though not a homomorphism of groups, is a homomorphism of heaps: g<x,y,z> = <gx, gy, gz>. _________________________________________________________________ You ask on what basis Stewart says on p.93, line 7, that k is nonzero. I think that if you look and see how Stewart gets "k", you will have the answer to that question. Stewart is not very explicit about that point; so perhaps your question should really have been, "What is k, and what are the elements z_i \in K_0 ?" In fact, if you look at lines 3 and 4 of this page, you will see that the z_i are the elements y_i y_1^-1, so k is y_1. And this is nonzero by the way Stewart chose to arrange the y's on the middle of the preceding page, with all the nonzero ones first. I hope that the version of this proof that I gave in my handout is clearer! _________________________________________________________________ You ask how, in my handout on Theorem 9.4, I determine the K-dimension of the space of K_0-linear maps K --> K to be m. I do this in two ways. One is to first determine its dimension as a K_0-vector-space to be m^2, and then note that its dimension as a K-vector-space is 1/m of that dimension. The other, which I sketch in parentheses, is based on the fact that every such map is determined by an m-tuple of elements of K. Assuming that it is the first argument that you are asking about, do you want to know why the space in question has K_0-dimension m^2, or why its K-dimension is 1/m times its K_0-dimension, or both? As I note on the class information sheet, if you asked this in my office hours, I would question you to see what you understood, and what I needed to clarify. But this is much harder to do by e-mail, so you should try to be very explicit about what you understand and precisely what you are asking. So please let me know, and I will then answer your question. _________________________________________________________________ You ask about the meaning of "larger" and "smaller" in the second sentence of the third paragraph of the handout on Theorem 9.4. In general, mathematicians use "larger" to mean "containing the other" and "smaller" to mean "contained in the other". That is what I mean here. However, it happens that what I say here also makes sense with "larger" and "smaller" interpreted as "having larger cardinality" and "having smaller cardinality", because, as I note in the next sentence, the method of proof will just be based on the orders of the groups. (Perhaps I should be more precise about how mathematicians "generally" use "larger" and "smaller". "Containing" and "contained in" are a common case; another is "containing something isomorphic to" and its reverse, "embeddable in", as when one calls a vector space of higher dimension "larger" than one of lower dimension, even if neither contains the other. One can in fact use the words with reference to any partial or total order relation on a set, as when referring to a "larger" or "smaller" real number. So it would be more accurate to say that the words refer to whatever partial ordering is natural to the situation; not in general to cardinality alone.) _________________________________________________________________ You ask why, in proving Theorem 8.11 of my addendum to Chapter 8, it is sufficient to show that the set of separable elements is closed under the field operations. Well, that means that the set of elements of L separable over K forms a subfield of L.
Now we are assuming L is generated over K by a set X of elements that are separable over K. The statement that X generates L over K means that the only subfield of L containing K and X is L itself; so in particular, the subfield of elements of L separable over K, since it contains K and X, is L itself; i.e., L is separable over K. _________________________________________________________________ You ask whether, in Lemma 9.1, the elements of G, being proved linearly independent, form a vector space, or a basis of a vector space. As I said in class, they don't form a vector space, because the sum or product of two homomorphisms is not in general a homomorphism. Rather, they form a _subset_ of the K-vector space of all K_0-linear maps K --> K. Dedekind's result proves that this subset is a linearly independent one. As to whether this makes it a basis "of a vector space" -- well, in a vector space V, if X is a linearly independent set, then X will be a basis _of_the_subspace_of__V__spanned_by__X. But it won't necessarily be a basis of V itself, unless it spans V. _________________________________________________________________ You ask where Stewart gets his "four candidates for Q-automorphisms" at the bottom of p.93; specifically, how he excludes other possible arrangements of the coefficients p, q, r and s. Stewart's four candidates are not gotten by looking at possible permutations of the coefficients. (There is no reason why an automorphism should permute the coefficients. For instance, if tau is a zero of t^2 - t - 1 in an extension of Q, then the nontrivial automorphism of Q(tau) sends p + q tau to (p+q) - q tau .) Rather, they are gotten by taking the four possibilities for alpha(omega) that Stewart has found in the preceding sentence, namely omega, omega^2, omega^3 and omega^4, and figuring out, using the definition of homomorphism, how a homomorphism having each of these properties must act. For instance, consider alpha_2, defined by alpha_2(omega) = omega^2. Using the definition of homomorphism, we see that alpha_2(omega^n) = (omega^2)^n = omega^(2n). This immediately gives alpha_2(omega^2) = omega^4. It also gives alpha_2(omega^3) = omega^6, but omega^6 is not among the terms occurring in (9.7). But since omega^5 = 1, we see that omega^6 = omega, so alpha_2(omega^3) = omega. Similarly, alpha_2(omega^4) = omega^8 = omega^3; hence alpha_2 has the form shown in the final display. The key to seeing what Stewart must mean is the phrase that begins the sentence, "This gives". It shows that the possible values for alpha must somehow be determined by the list of four possibilities, with which the preceding sentence ends. I.e., one must be able to deduce from the fact that alpha(omega) has one of those forms the form that alpha(p + ... + t omega^4) has. One sees that the definition of "homomorphism" allows one to do so. _________________________________________________________________ You ask where the first equality in the proof of Corollary 9.5 (p.93) comes from. It comes from the tower law. At this point, you need to make the tower law an automatic part of your way of looking at degrees of extensions, so that whenever you see two successive field extensions, you are automatically conscious that their degrees are related in the way that law states! Do you have any question about how the tower law gives that equality?
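(If a worked instance of the tower law helps, here is a small check using the sympy library. The example K = Q, L = Q(sqrt 2), M = Q(2^(1/6)) is my own, not Stewart's, and the irreducibility of t^3 - sqrt 2 over Q(sqrt 2) is asserted in a comment rather than computed.)

  # Tower law illustration (my own example): K = Q, L = Q(sqrt 2), M = Q(2^(1/6)).
  from sympy import sqrt, Rational, symbols, minimal_polynomial, degree

  x = symbols('x')
  deg_M_over_K = degree(minimal_polynomial(2**Rational(1, 6), x), x)  # x^6 - 2, so 6
  deg_L_over_K = degree(minimal_polynomial(sqrt(2), x), x)            # x^2 - 2, so 2
  # 2^(1/6) is a zero of t^3 - sqrt 2, which is irreducible over Q(sqrt 2),
  # so [M:L] = 3, and the tower law predicts [M:K] = [M:L][L:K] = 3 * 2:
  print(deg_M_over_K, deg_L_over_K, deg_M_over_K == 3 * deg_L_over_K)  # 6 2 True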
_________________________________________________________________ You ask what I mean, in the second sentence of the first proof of Lemma 9.6, by "Choose (y_1, ..., y_N) \in V - {0} to minimize the number of y_i which are nonzero;" specifically, whether I am dropping elements which are zero out of the set. I'm not sure what you mean. The first question is whether by "elements which are zero" you mean "elements of V which are zero" or "elements y_i which are zero". In V, there is, of course, just one element which is zero, namely the vector 0 = (0,...,0), and in writing "V - {0}" I mean "all elements of V which are not zero", i.e., I am excluding (you can say "dropping") 0 from the set of elements we consider. If you mean "elements y_i which are zero", then no, I am not dropping them. Rather, I am saying that from all vectors (y_1, ..., y_N) \in V - {0}, we are to choose one for which the number of y_i that are nonzero is the smallest. For instance, if N=6, and we look at all elements of V - {0}, and ask how many nonzero components each such element has, there might be some with 2 nonzero components, some with 4, some with 5 and some with all 6 components nonzero. In that case, choosing an element (y_1, ..., y_N) \in V - {0} to minimize the number of y_i which are nonzero means choosing an element (y_1, ..., y_N) for which the number of nonzero components is 2 (rather than 4, 5, or 6). Perhaps you took the word "minimize" to imply making some change in the N-tuple. But it doesn't; it just means "make the minimal choice". Does this help? Is this what your problem was? _________________________________________________________________ You ask about Stewart's application of Theorem 10.1 in the last sentence of section 10.1 (p.98). To discover what he is doing, look carefully at the statement of that Theorem. It concerns a K-monomorphism from a subextension of a normal extension field into that field. Now look at the situation in the proof of the Proposition. The only normal extension field under discussion is L, so to apply the Theorem, Stewart must be considering a K-monomorphism from some subextension of L into L. But tau is described as an isomorphism K(alpha) --> K(beta). So where does he have a K-monomorphism into L ? Answer: He must be using the fact that K(beta) is contained in L to regard tau as a map into L. Do you see that, regarding tau in that way, the application of the Theorem gives the desired result? _________________________________________________________________ You ask whether there were other examples of inseparable irreducible polynomials than those given on p.84. Well, Proposition 8.6 shows you that any inseparable irreducible polynomial has to have certain features in common with that example, namely that it has to be over a field of nonzero characteristic p, and that the exponents of all powers of t have to be multiples of p. With those restrictions, you should be able to see that one can do a lot of variations on the particular example on p.84. First of all, though Stewart assumes K_0 = Z_p in that example, he doesn't do anything that requires that particular field; he merely needs characteristic p; so one can start with K_0 any field of that characteristic. Secondly, the polynomial need not have the very simple form shown. It is easy to show that t^(p^n) - u works just as well. Variants such as t^2p - t^p - u also work; however, with the tools we have so far, it is harder to show them irreducible, so Stewart just gave the simplest case. 
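(A small computational illustration of the last two paragraphs, with p = 5 -- my own sketch using the sympy library, not anything in Stewart: the formal derivative of t^p - u becomes zero once its coefficients are read mod p, and the "freshman's dream" identity (t - u)^p = t^p - u^p mod p is the reason such a polynomial has a single repeated zero in a splitting field.)

  # Sketch (my own, not Stewart's): inseparability phenomena in characteristic p.
  from sympy import symbols, diff, Poly

  t, u = symbols('t u')
  p = 5

  f = t**p - u
  print(diff(f, t))                      # 5*t**4 in characteristic 0 ...
  print(Poly(diff(f, t), t, modulus=p))  # ... but the zero polynomial mod 5

  # "Freshman's dream": mod p the middle binomial coefficients vanish,
  # so (t - u)**p expands to t**p - u**p.
  print(Poly((t - u)**p, t, u, modulus=p))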
_________________________________________________________________ As you suggest, at the very end of the proof of Theorem 10.1 on p.97, when Stewart says "Therefore sigma is an automorphism of L", this is synonymous with the statement in the preceding sentence that it is an isomorphism of L with itself. I think he wanted to get the word "automorphism" in there so that he could combine it with the observation in this sentence that its restriction to K is the identity, to get the final conclusion that it is a K-automorphism of L. _________________________________________________________________ You ask whether the material we've been looking at is related to the concept of "lifting" in topology. I can see only a very loose connection. When one has an interesting map of sets f: X --> Y (especially where X and Y consist of complicated structures of some sort), then given an element y\in Y, if we can find an element x\in X such that f(x) = y, we may call it a "lifting" of y to an element of X. So in topology, if we have a map of spaces m: S --> T, and let X, Y be the set of all paths in S and all paths in T respectively, then a map f: X --> Y is induced, and given a path y in T one can try to "lift" it to a path x in S. In the present chapter, given a subfield M of a field L, we can let X be the set of automorphisms of L, Y the set of monomorphisms M --> L, and f: X --> Y the operation of restricting an automorphism of L to M to get a monomorphism M --> L. In that sense, Theorem 10.1 concerns the "lifting" of monomorphisms M --> L to automorphisms of L. _________________________________________________________________ You say in your answer to your pro-forma question that near the end of the first paragraph of the proof of Theorem 10.3, p.98, "since N is a splitting field for f and f splits in P, therefore P = N". But this does not follow directly from the definition of splitting field. A splitting field for f is a _smallest_ field in which f splits. So the facts that N is a splitting field for f and f splits in P imply that P _contains_ N. Combining this with the assumption we have made that P is a subfield of N, we conclude that P = N. _________________________________________________________________ You ask why Stewart's proof of Theorem 10.3 starts by constructing a splitting field for f over L, rather than a splitting field for f over K. We want the field we get to contain L. If we constructed a splitting field for f over K, we could, with some work, show that it contained a subfield isomorphic to L, and so by making identifications, we could assume without loss of generality that it contained L; but Stewart's way of doing it has it containing L from the start. _________________________________________________________________ You ask whether the normal closure of an extension L:K is "truly unique", and not just unique up to isomorphism. Actually, there are _three_ kinds of uniqueness one could ask about, and although the normal closure is not "truly unique", it is unique in the third sense. First, why is it not truly unique? Because if N is a normal closure of L:K, then one can create an extension of L isomorphic to N but not equal to N, and it will also be a normal closure. Loosely speaking, just take a different set of elements in one-to-one correspondence with the elements of N, and give them the same field structure as that of N. What is the third kind of uniqueness? 
If a given extension field M of L contains a normal closure N of L over K, then N is the _only_ normal closure of L over K in M. So, for instance, given any algebraic extension L:K lying within the field C of complex numbers, there is a unique subfield N of C which is a normal closure of L:K. _________________________________________________________________ You ask about the equalities m(alpha) = tau(m(alpha)) = m(tau(alpha)) in the display on p.99. The first holds, as you say, because m(alpha) = 0, and tau, as a homomorphism, must send 0 to 0. The second holds because tau is a K-homomorphism, and m has coefficients in K. Specifically, if m(t) = Sigma a_i t^i, then m(alpha) = Sigma a_i alpha^i; now apply tau to this, and use the fact that tau(a_i) = a_i. _________________________________________________________________ You ask whether one always forms normal closures as Stewart does at the top of p.99 by "adjoining the missing zeroes". Yes, although this language of Stewart's was vague, because he was just giving an intuitive summary of what he had done. Note that in that example, he was working with subfields of C, and since every polynomial splits over C, the "missing zeroes" could all be found there. In general, we may not be given an extension containing our L over which every polynomial splits; so we have to "create" the missing zeroes as in the proof of Theorem 3.5. You ask in particular about the case of an extension of infinite degree. Well, first recall that my errata to Stewart involved the insertion of the word "algebraic" before "extension" in the Definition at the top of p.82. So even when we are talking about infinite extensions, in discussing normality we must restrict attention to infinite _algebraic_ extensions. For such extensions, one can show using the Axiom of Choice that the process of adjoining zeroes as in the proof of Theorem 3.5 can be "performed infinitely many times", and will give a field with the properties of the normal closure. But that is outside the scope of Math 114. _________________________________________________________________ You asked about the relation between the results in Exercise "7.8" and De Morgan's laws. There's a formal similarity: In a Boolean ring (or, if you have only seen this special case, in the set of all subsets of a set X), the operation of _complement_ interchanges unions and intersections; and here, the operators * and dagger interchange least upper bound (= subobject generated) and intersection. But it's hard to get more in the way of relationship, since in one case one is looking at one map (complementation) from a set to itself, and here we are looking at two maps between two different sets. _________________________________________________________________ You ask how one could have discovered an ingenious proof like that of Theorem 10.6. Well, one thing I suspect is that the shortest route to the proof that one could come up with in retrospect is not the way it actually developed historically. And I don't know the historical development myself. In particular, there is the question of whether the original statement of the result looked anything like the present formulation; and how one could have come up with the present statement. However, we'll have to ignore these historical questions. Taking the result as it stands, I look at it as follows. We want to count the possible ways of mapping L into N; intuitively, of "fitting a copy of L into N". How do we study other questions of this sort?
Suppose, for instance, that we have a cube of side 1, and we also have a right triangle with two sides of length 1, and want to see how many ways we can paste that right triangle onto the cube, so that the right angle goes onto one of the vertices, and the adjacent sides go to two edges of the cube. We can first ask where the right angle is to be placed; there are 8 possibilities. Once we have placed it, we can choose one of the adjacent edges of the triangle, and ask where it can go; there are three possibilities, the three edges of the cube that come out of that vertex. And once we have placed this, the remaining edge can go to either of the other two edges coming from that vertex. So altogether there are 8 x 3 x 2 = 48 ways of placing the right triangle on the cube. Now we return to our field L, and the problem of counting the ways of embedding it in a normal closure N. If we write L = K(alpha_1,..., alpha_m), then we can count the choices for where alpha_1 will go, for each such choice count the possibilities for where alpha_2 will go, and so forth. The number of choices at each stage is the degree of the element alpha_i we are looking at over the extension generated by the preceding terms; i.e., the degree of the extension K(alpha_1,..., alpha_i) : K(alpha_1,..., alpha_(i-1)). The product of these numbers is the degree of the whole extension L:K, so that degree is the number of embeddings. In saying "count the choices at each step", I am referring to an inductive process without using the words "By induction". Stewart does formalize the result as an induction; the process of doing so renders a proof simpler, but sometimes makes it look more mysterious. As I mentioned in class, he had a choice of using the inductive hypothesis "at the bottom", "at the top", or "in the middle", and chose to use it "at the top" (applying it to L:K(alpha), rather than, say, L' : K where L = L'(alpha)). And instead of asking "In how many ways can we extend a given partial homomorphism?", he said, in effect, "Let's take a particular extension of our partial homomorphism, and see in how many ways we can modify it." One has many choices like this in writing up a proof, even when the underlying idea is the same. I don't know whether these comments answer your question ... . _________________________________________________________________ You ask about the line "Hence we may assume" on the middle of p.101. I meant to talk about that, but forgot to put it on my list! What he means is, "If the result is true whenever M is normal, then it must be true in all cases -- because if we have any old M, then just take its normal closure, and apply the result for normal extensions to that closure, instead of to M, and the result for M will follow. Hence, it is enough to prove the result for normal M, since as we have just seen, the general result will follow from this case." This is the kind of reasoning that is frequently meant when mathematicians say "Without loss of generality, we may assume ...". _________________________________________________________________ You ask about the sentence preceding the statement of Theorem 10.10. This is a continuation of the ideas of the preceding paragraph, i.e., the proof of Theorem 10.9. In proving Theorem 10.6, where the extension was assumed separable, Stewart used the fact that the number of zeroes of the minimal polynomial f of alpha in a splitting field was exactly the degree of f.
In proving Theorem 10.9, he uses the weaker statement that, without the assumption of separability, the number of zeroes could still be asserted to be _< the degree. In this sentence, he observes that if alpha is in fact _inseparable_, then the number of zeroes will be _less_than_ the degree. In each case, the same calculation that was used in the proof of Theorem 10.6 is followed, but with a different statement about the relation between the degree and the number of zeroes. _________________________________________________________________ You ask what Stewart means by "order-reversing" on p.104, Theorem 11.1, statement 2. Stewart understands the set of intermediate fields and the set of subgroups of G to be partially ordered by inclusion; so "order-reversing" means "inclusion-reversing", i.e., if the intermediate field M is contained in the intermediate field M', then the subgroup M* will _contain_ the subgroup M'*, and similarly for the operation dagger. (These are properties we have seen before; he is just describing them by a brief phrase here. Incidentally, have you seen the concept of "partially ordered set", or "partial ordering" on a set? If not, my statement above that Stewart understands the set of intermediate fields and the set of subgroups to be partially ordered by inclusion won't mean anything to you; but hopefully what follows it will still be clear.) _________________________________________________________________ You ask why, on p.111, Stewart can say the fixed field of U is "clearly" Q(i sqrt 2). Actually, all the fixed fields except those of C and E can be gotten fairly easily using the following observation: If H is any subgroup of G other than C or E, then _either_ H has no element in which sigma appears with odd exponent, _or_ H contains sigma^2. Consider these two cases separately: (1) Suppose H has no element in which sigma appears with odd exponent. Then every element of H, when applied to any of the basis elements i^m xi^n, (m = 0, 1, n = 0, 1, 2, 3) changes it either to itself or its negative. (Elements in which sigma has odd exponent send xi to +-i xi, and hence permute some of these basis elements, along with sign-changes.) It follows that the only linear combinations of the given basis elements that are fixed under H are those in which the basis elements that get their signs changed by some members of H have coefficient 0. In each case, if we list these basis elements we get an immediate description of H^dagger. (2) Suppose H contains sigma^2. This element fixes those basis elements in which xi has even exponent, and changes the sign of those in which xi has odd exponent. Hence every element of H^dagger involves only basis elements in which xi has even exponent. Now for basis elements in which xi has even exponent, _every_ element of G either leaves the element fixed or changes it to its negative; in particular, this is true of every element of H, so as in (1) we see that H^dagger will consist of all linear combinations of a certain subset of our basis. _________________________________________________________________ At first I was puzzled by your asking why someone might think that the situation Stewart refers to at the top of p.115 implied that each G_i was normal in G. Then I saw that, as you indicate, Stewart just refers there to "(13.1)". Well, that's not what he means -- he means "(13.1) together with condition 1 following it". Yet another item for me to write him that he should clarify! Thanks for pointing it out. 
Do you see that (13.1) together with condition 1 might lead people to think that all G_i are normal in G ? And can you see that, as he points out, it really does not follow? _________________________________________________________________ You ask whether on p.116, in the proof of statement 2 of Theorem 13.2, the normality statements follow by writing "G_i N / N =~ G_i". No -- that isn't true in general! After all, suppose we have G_i = N. Then G_i N = NN = N, so in that case G_i N / N is trivial. Rather, the relation G_i N / N <| G_i+1 N / N is a case of the relation "A/H <| G/H" in part 2 of Lemma 13.1; in this case with N in the role of H, G_i N in the role of A, and G_i+1 N in the role of G. _________________________________________________________________ You ask about the first step in the last display of the proof of part 2 of Theorem 13.2 on p.116, namely G_(i+1)N/G_iN = (G_(i+1))(G_iN)/G_iN. Because G_i is a subgroup of G_i+1, we can write G_i+1 = G_i+1 G_i. Make that substitution in the numerator of the left-hand side, and (after an application of associativity) you get the right-hand side. However, the end of Stewart's proof of part 2 is rather ugly. Here is the way I would finish it: After getting the group into the form G_i+1 N / G_i N, consider the maps G_i+1 --> G_i+1 N --> G_i+1 N / G_i N, where the first is the inclusion and the second is the quotient map. I claim the composite of these maps is surjective. Indeed, every element x \in G_i+1 N / G_i N is a coset of an element g n with g \in G_i+1, n \in N. But since N is contained in the group we are dividing out by, the coset of g n is the same as the coset of g, so x is the image of g, proving surjectivity. Moreover, the kernel of the composite map contains G_i. Hence the image of the composite map is isomorphic to a factor-group of G_i+1/G_i; and a factor-group of an abelian group is abelian. _________________________________________________________________ You ask about Stewart's hint to Exercise 11.4 p.107, and how maps that move elements of C can be relevant to the study of automorphisms of C(t) over C, which must keep all elements of C fixed. The idea is to consider elements of C(t) intuitively as "functions" on the Riemann sphere (each of which will be undefined at a finite set of points; e.g., the function t is undefined at infinity, while 1/t, though defined at infinity, is undefined at 0). Thus "C" appears in several guises in this problem: as a subfield of C(t), as the set of values these functions take on, and as comprising all but one point of the domain-set of these functions. In terms of the first and second of these viewpoints, we are not interested in "moving" elements of C, but from the third point of view, we can observe, for instance, that the automorphism of C(t) that takes t to t^-1 corresponds to composing each element r(t) \in C(t), regarded as a function, with the map of the Riemann sphere into itself that takes each point z to z^-1 (counting 0 and infinity as inverses). Likewise, each of the elements of this group is given by composition with a particular transformation of the Riemann sphere into itself. It takes some work to make this viewpoint rigorous; for instance, if one multiplies the elements t^2, t^-1 \in C(t), regarded as functions, one gets a function which is undefined at 0 because t^-1 is undefined there; but one wants the result to be the function t; so one needs some operation of "filling in missing values".
But ignoring the question of how to formulate things precisely, one can use heuristically the idea sketched above to help one picture this group of automorphisms. This, I think, is all Stewart intends by his hint. _________________________________________________________________ You comment in connection with the list of subgroups on p.110 that A and B are isomorphic subgroups of G, with A normal but not B, so that "normality is not preserved under isomorphism". This is true in the sense that you state it, namely that a group G can have two isomorphic subgroups, one of which is normal in G and the other isn't. But the natural way to look at normality is as a property of a pair (G, H) consisting of a group G and a subgroup H; and one should consider two such pairs (G_1, H_1) and (G_2, H_2) isomorphic if there is an isomorphism phi: G_1 -> G_2 such that phi(H_1) = H_2. For that formulation, normality is preserved under isomorphism; i.e., if (G_1, H_1) and (G_2, H_2) are isomorphic group-and-subgroup pairs, then H_1 <| G_1 if and only if H_2 <| G_2. _________________________________________________________________ You ask about Stewart's statement on p.117, second paragraph, that the class of soluble groups is closed under extensions, by Theorem 13.2. Well, if G is an extension of a solvable group A by a solvable group B, then, as discussed in the first sentence of that paragraph, it has a subgroup N isomorphic to A such that G/N is isomorphic to B. Being isomorphic to A, N will be solvable, and being isomorphic to B, G/N will be solvable, so by part 3 of the Theorem, G will be solvable. _________________________________________________________________ You ask about the isomorphisms A_4 / V =~ C_3 and S_4 / A_4 =~ C_2 on p.115. Since |A_4| = 12 and |V| = 4, |A_4 / V| = 3, and any group of order 3 is isomorphic to C_3. The analogous argument gives the description of the other factor-group. One can, of course, in each case work out the list of cosets and their multiplication tables, and verify that these are isomorphic to the multiplication tables of C_3 and C_2; but the above shortcut will do in this case. _________________________________________________________________ You ask, in connection with Example 4, p.115, "How does a group of degree 4 have a subgroup of order 12"? You might better ask, "I know what is meant by the order of a group, but what is meant by its `degree'?" What Stewart means by "degree" is "the number n of elements such that G is represented as a group of permutations of that many elements". It would be better if he defined the term, but I think he is taking for granted that whether one knows what "degree" means for a general group or not, one can deduce from the phrase he uses, "The symmetric group S_4 of degree 4", that in this context "degree" means the subscript n appearing on the groups S_n. _________________________________________________________________ You ask why Stewart writes G_0 = N in the proof of point 3 of Theorem 13.2. He has assumed that G/N is solvable, so that it has a chain of subgroups, starting with 1 and ending with G/N, with certain properties. Now any subgroup of G/N has the form H/N where H is a subgroup of G containing N. Hence, instead of giving the i-th subgroup in the abovementioned chain a name like "F_i", he takes advantage of the above fact to write it "G_i/N", where G_i is a subgroup of G containing N. In particular, since the first of these subgroups of G/N is 1, when we write it G_0 / N, the numerator "G_0" must be N.
(Clearly, 1 = G_0/N <==> G_0 = N.) _________________________________________________________________ You ask why in step 2 of Stewart's proof of Theorem 13.4 (p.119) we can assume N contains (t^-1 x t)x^-1 . We have assumed it contains x. Hence as it is a normal subgroup, it contains all conjugates of x, including t^-1 x t. Multiplying together the elements t^-1 x t and x^-1, we see that it contains the element you asked about. _________________________________________________________________ I hope my lecture answered most of your questions. Regarding the reference to "(1)" at the top of p.118, he means "(13.1)"; but unfortunately, not just the line of text at the bottom of p.114 labeled with that number, but also the conditions "1." and "2." at the top of the next page. This seemed too complicated to give as an "erratum to the text", but I will point it out to Stewart, and am explaining it to those who ask. _________________________________________________________________ You ask why Stewart's 4 cases in the proof of Theorem 13.4 really cover all possibilities. Although he states them in the form "Suppose N contains ...", and asserts that these four cases exhaust all possible nontrivial normal subgroups N, more is true: Every nonidentity element x \in A_n is of one of the four forms indicated; i.e., when it is written as a product of disjoint cycles, either one of these cycles has length >_ 4, or at least two have length 3, or one has length 3 and the rest have length 2, or all have length 2. When stated in this form, is it clear? _________________________________________________________________ You ask about the statement on p.117 that "every element of G generates a cyclic subgroup". If x is an element of G, then the subgroup that it generates is a group generated by one element (namely, x), and a group generated by one element is called a "cyclic group". So that group is a cyclic subgroup of G. You're not the only one who asked this. Is there something about the way you saw "cyclic (sub)group" defined previously that made it hard to see this? _________________________________________________________________ You ask how one proves that a normal subgroup is a union of conjugacy classes. It follows from the definitions. The statement that a subgroup N of G is normal means that for every x \in N and every g \in G, the conjugate g x g^-1 lies in N. In other words, whenever x is in N, all elements of G conjugate to x are in N. In other words, for every element x of N, all the other members of its conjugacy class are also in N. In other words, N is a union of conjugacy classes. These observations don't even require that we are looking at a subgroup of G; just a subset that is closed under conjugating by all elements of G. _________________________________________________________________ You ask how step 2 on p.119 uses step 1 on the preceding page, when they involve different permutations "t". The argument given in step 2 shows that if N contains a permutation whose cycle decomposition includes two 3-cycles, then N contains a 5-cycle, and step 1 showed that if N contains a permutation whose decomposition includes an n-cycle for any n >_ 4 (so in particular, if it contains a 5-cycle) then it contains a 3-cycle. Putting those two results together, we conclude that if N contains a permutation whose cycle decomposition includes two 3-cycles, then N contains a 3-cycle.
We don't have to talk about whether we "use a different t", because we are not saying "by the proof of step 1", just "by (the result proven in) step 1"; and the result proven doesn't specify any t. _________________________________________________________________ You ask about formal manipulations involving multiplication of subsets of groups by elements of the groups. All the statements you wrote were correct: (ab)B = a(bB), aA = aB <=> A = B, and aAb = B <=> aA = Bb^-1. They can be verified quickly from the definitions. One can also note that multiplication of subsets by elements defines an "action" of G (in the sense I spoke about in class) on the set of all subsets of G. Your first equation is one of the two identities one has to verify to show this, while the second equation follows from the properties of such an action. (In any set on which a group G acts, whenever ax = ay one can multiply this equation by a^-1 and get x = y.) Your third implication is an instance of the fact that we also have a right action of G on the set of subsets of G, and the right and left actions respect one another: a(Xb) = (aX)b. (After one learns that when a group G acts on the left on a set, one has ax = ay ==> x = y, one has to be careful to note that ax = bx does not imply a = b; rather, this holds whenever a^-1 b is in the isotropy subgroup of x.) _________________________________________________________________ You write that you're having trouble understanding Stewart's manipulations with permutations. Aside from reviewing your 113 text, here are two points to bear in mind: First, remember that Stewart, for some reason, is composing his permutations as though written on the right of their arguments. (This was something mentioned in the list of errata. Did you note it in your text?) So, for instance, (12)(23) = (132) in his notation, because he means "first interchange 1 and 2 and then interchange 2 and 3", rather than the more common interpretation "first interchange 2 and 3 and then interchange 1 and 2", which makes the product (123). Second, whenever he mentions a factor that "doesn't move the elements named" (in step 1, "bc...", in step 2, "y", in step 3, "p" and in step 4 again "p"), one should take it into account in one's calculations, but it doesn't make any difference in the end. For instance, in step 2, when he forms t^-1 x t x^-1, then on the one hand, one has to verify that on n \in {1,2,3,4}, this element indeed behaves as (14)(23); but what if n is not in {1,2,3,4}? Then applying t^-1 leaves it unchanged (still n), applying x takes it to p(n), applying t leaves p(n) unchanged, and applying x^-1 takes p(n) back to n; so it is unchanged. Thus, t^-1 x t x^-1 is precisely (14)(23). _________________________________________________________________ You ask why Stewart chooses the cycles that he does in the proof of Theorem 13.4. I guess you mean the cycles t (and not the choice of cases 1-4 for the cycle-decomposition of x). I hope you recall the discussion I gave of conjugation last Wednesday. In particular, if x is a permutation, then t^-1 x t will be a permutation that acts "like" x, but where the details about _which_ elements are at which points of which cycles get modified by the action of t. (For a more detailed statement see the handout on the same result accessible through my web page.)
Hence, if t moves _few_ elements, then t^-1 x t will agree with x in what it does on most elements, so (t^-1 x t) x^-1 will agree with the identity in what it does on most elements, i.e., it will move few elements, bringing us close to what we want, a lone 3-cycle. So the strategy is to choose t that simultaneously (a) moves few elements, (b) belongs to A_n (so we can't use a 2-cycle), (c) can be expressed in terms of the form we have assumed x has (e.g., in case 1, in terms of a_1 , ... , a_m ) so that we can compute (t^-1 x t) x^-1 explicitly, and (d) doesn't actually commute with x, so that (t^-1 x t) x^-1 gives us a nonidentity element of N. In each of Stewart's cases (1)-(4), if you look for an element satisfying all of the above conditions, you are likely to come up with something close to what Stewart gives. (It takes a little experience to see when (d) will hold. The condition is equivalent to saying that when you modify x by conjugation by t, you don't just get x again. If you understand how to compute easily the conjugate of a permutation by another permutation -- again see the beginning of my online note if that isn't clear -- this is probably the easiest form of that condition to test). If you have time, experiment and see! _________________________________________________________________ You ask whether in Lemma 13.6 (p.120) one can replace "(12)" with "(1m)" for arbitrary m \in {2,...,n}. Nope! Notice that if we label the 4 vertices of a square consecutively as "1, 2, 3, 4" and regard the symmetry group D_8 as a group of permutations of those four vertices, then D_8 contains the rotation (1234) and the reflection (13); but it is not all of S_4. It is an interesting exercise to figure out for which values of m S_n _is_ generated by (12...n) and (1m) ! _________________________________________________________________ You ask how Stewart uses the First Isomorphism Theorem to get "|MT| = |M| |T| / |M \intersect T|" near the bottom of p.123. Since A is abelian, all subgroups are normal. Hence we can apply the First Isomorphism Theorem with any two subgroups in the roles of the "A" and "H" of that theorem; let us use M and T. Then the Theorem (Lemma 13.1, part 1) says M / M \intersect T =~ MT / T. Taking the orders of both sides, we get |M| / |M \intersect T| = |MT| / |T|. Solving for |MT| now gives the equation claimed. (Intuitively, the idea is that when we multiply all elements of M by all elements of T, we get |M| |T| products; but some of those products are equal; namely, given a product x y, we can get another product (x g)(g^-1 y) equal to it for any g in M \intersect T. So to evaluate |MT| we have to divide |M| |T| by the number of such elements g, i.e., |M \intersect T|. This counting argument can be made precise, and works even when neither M nor T is normal, so that MT is just a set of elements, not necessarily a subgroup.) _________________________________________________________________ You ask how we know G_i <| G_i+1 in the proof of Corollary 13.11. By the preceding Lemma, G_i is normal in G; this automatically makes it normal in any intermediate group. _________________________________________________________________ You ask how Stewart knows, near the bottom of p.123, that the order of t^(r/p) is p. In general, if an element has order mn, then its n-th power has order m. If you recall that the order is the least positive integer such that raising the element to that power gives the identity, this should not be hard to see.
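(Returning to the Lemma 13.6 question above: if you would like to check the D_8 example by machine, here is a short computation with sympy's permutation groups. It is only my own illustration, with the points 1, 2, 3, 4 relabeled 0, 1, 2, 3 as sympy requires.)

  # My own check of the example above: inside S_4, the 4-cycle (1 2 3 4) and the
  # reflection (1 3) generate only a group of order 8 (the dihedral group D_8),
  # whereas (1 2 3 4) and (1 2) generate all 24 elements of S_4.
  from sympy.combinatorics import Permutation, PermutationGroup

  four_cycle = Permutation([1, 2, 3, 0])   # (1 2 3 4), written on {0,1,2,3}
  refl_13    = Permutation([2, 1, 0, 3])   # (1 3)
  transp_12  = Permutation([1, 0, 2, 3])   # (1 2)

  print(PermutationGroup([four_cycle, refl_13]).order())    # 8
  print(PermutationGroup([four_cycle, transp_12]).order())  # 24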
_________________________________________________________________ You ask what inductive hypothesis Stewart is assuming when he says that Lemma 13.13 (p.123) will be proved "by induction on |A|". He is assuming that for all abelian groups B of orders smaller than |A|, the result of the Lemma holds. _________________________________________________________________ You ask what Stewart means by the statement in Theorem 13.12 that "All such subgroups are conjugate in G". He means that for any two such subgroups A and B, there exists an element g\in G such that g A g^-1 = B. _________________________________________________________________ You ask whether in the definition of "radical extension" on p.128, the "m" must be finite. Yes; whenever a "list" of the form "x_1, ..., x_n" is shown, this is understood to imply that n is a nonnegative integer. This is a convention you should learn to take for granted. One could easily define a not-necessarily-finite radical extension to be an extension L:K such that for every x\in L, there is a radical sequence alpha_1 , ... , alpha_m of elements of L (defined as on that page, but without the assumption that they generate L) such that x\in K(alpha_1,...,alpha_m). But for the purposes of studying what equations can be solved in radicals, it is enough to have the concept of a finite radical extension, so that is what Stewart defines "radical extension" to mean. _________________________________________________________________ Yes, the extensions you name are examples of extensions to which Lemma 14.4 applies. It is reasonable to expect an author to give examples when a new concept is introduced, and to give applications of a result proved when these applications are not obvious. But when a result concerns concepts the reader is familiar with, it is reasonable to expect the reader to apply the result to these concepts by himself or herself. The obvious way to get a field in which t^n - 1 splits is to take the splitting field of that polynomial over Q, namely Q(zeta_n), and an example of a splitting field of an equation t^n - a over such a field is Q(zeta_n, a^(1/n)). So -- glad you found those examples; I hope that looking for basic examples will be an automatic part of reading mathematics for you, and that you won't be surprised at authors' taking it for granted that you will do this. _________________________________________________________________ You ask why, as stated at the bottom of p.130, we can "insert extra elements" to get all the n(i) prime. Well, as an example, suppose that at the first step, the element alpha_1 we adjoined satisfied (alpha_1)^12 = a_1, for some a_1 \in K. Then let us insert the terms (alpha_1)^6 and (alpha_1)^2 before it. Now (alpha_1)^6 has the property that its square is in K (since its square is (alpha_1)^12 = a_1), and the next term, (alpha_1)^2, has the property that its cube is in the extension we have just formed, K((alpha_1)^6), and finally alpha_1 has the property that its square is in the second extension, K((alpha_1)^2). So we've replaced the single step of adjoining a 12th root by successive steps of adjoining a square root, a cube root, and a square root, where the exponents 2, 3, 2 are prime. I hope the general argument is clear from this example. _________________________________________________________________ You ask why, on p.131 line 10, when Stewart sets epsilon = alpha_1/beta, he can assert that epsilon^p = 1.
Remember that alpha_1 is a zero of t^p - a for some a\in K (first line of the page), and f is the minimal polynomial of alpha_1; hence it is a divisor of t^p - a. Hence beta, being another zero of f, is also a zero of t^p - a. Hence (alpha_1/beta)^p = a/a = 1. _________________________________________________________________ You ask how one can show that Q(sqrt 2) is not isomorphic to Q(sqrt 3). That is actually a case of what I cover in the handout on "Extensions of the form K(sqrt alpha_1 , ... , sqrt alpha_n) : K". Part (iii) of the main theorem of that note describes which elements of K have square roots in L. Applying it to the case where K = Q and L = Q(sqrt 2), it says that the only rational numbers having square roots in Q(sqrt 2) are those that are squares of rational numbers, and those that are 2 times squares of rational numbers. Since 3 is neither (i.e., neither 3 nor 3/2 is a square in Q) it follows that 3 does not have a square root in Q(sqrt 2). Since it does have a square root in Q(sqrt 3), these are not isomorphic (as extensions of Q, and hence as fields). However, if you don't want to use the whole inductive proof of that theorem, you can just isolate the key argument as it applies to this case: Suppose that (a + b sqrt 2)^2 = 3. Expanding, and recalling that {1, sqrt 2} is a basis of Q(sqrt 2) : Q, we conclude that the coefficient of 1 and the coefficient of sqrt 2 must be the same on both sides of that equation. Looking at the coefficient of sqrt 2, you immediately get ab = 0, so a = 0 or b = 0, and it is easy to get a contradiction in either case. _________________________________________________________________ You ask why on p.132, on the 4th line of the proof of Theorem 14.1, N:K_0 is radical. I assume that you followed Stewart's reference to Lemma 14.2 and saw that this makes N:K radical. Now K_0 contains K, and we see that a "radical sequence" for N:K will also be a radical sequence for N:K_0. _________________________________________________________________ You ask whether the possibility that a polynomial has multiple zeroes would affect the concept of the Galois group of a polynomial as a group of permutations of the zeroes, as discussed in the last paragraph of p.132. Yes and no! If we write the factorization of the polynomial over the splitting field as (t - alpha_1) ... (t - alpha_n), then if some of these alphas are the same, we can't associate to each automorphism a specific permutation of the symbols alpha_1 , ..., alpha_n (because different _symbols_ can represent the same zero). However, if we specify that the _distinct_ roots should be alpha_1, ... , alpha_m (where m may be less than the degree of the polynomial), then we can associate to each automorphism a specific permutation of the symbols alpha_1 , ..., alpha_m, and hence a permutation of {1,...,m}. _________________________________________________________________ You ask whether the method Stewart uses to get a polynomial over Q that is not solvable by radicals in Theorem 14.8, p.134, can also be used to get such a polynomial over the real numbers. Nope! By the reasoning Stewart gives on the preceding page, every polynomial over the real numbers has a splitting field in the complex numbers. But the extension C:R has no intermediate fields other than R and C, so the splitting field of every polynomial over R is either R or C. These extensions have Galois groups Gamma(R:R) = 1 and Gamma(C:R) = Z_2, both of which are solvable.
Now Stewart's proof of Lemma 14.7 is valid, step by step, with "R" in place of "Q". Why is there not a contradiction? Because that Lemma applies to an irreducible polynomial of degree p; but there is no irreducible polynomial of degree > 2 over R. In Theorem 14.8, Stewart shows that t^5 - 6t + 3 is irreducible over Q, but his proof uses Eisenstein's criterion, which we proved only for Q. (Eisenstein's criterion can be adapted to a much larger class of fields, but definitely not to all fields.) Can you see how to prove the fact I stated above, that "there is no irreducible polynomial of degree > 2 over R" ? (It follows from other facts mentioned in this e-mail.) _________________________________________________________________ You ask about the trigonometric solution of the cubic that Stewart refers to on p.135. When he calls it "well-known", he doesn't mean that most mathematicians know it, or that you are expected to know it! Just that most mathematicians know that it exists, and that it can be found in books. Anyway, the idea is as follows. Note the third-from-last display on p.56: cos(3 theta) = 4 cos^3 (theta) - 3 cos (theta). This can be used to solve certain cubics, in the following way. Suppose one has an equation to solve of the form A = 4 x^3 - 3x where A is a known real number between -1 and 1, and x is the unknown. Find in a trig table an angle phi such that cos phi = A. Then letting theta = phi/3 in the above formula about cosines, we get A = 4 cos^3 (theta) - 3 cos (theta), so cos(theta) is a solution to the equation we wanted to solve. Moreover, if we have one value of phi that satisfies cos phi = A, then obviously phi + 2pi and phi + 4pi will also have that property, but cos((phi + 2pi)/3) and cos((phi + 4pi)/3) will in general be different from cos(phi/3), so this method gives not one, but three solutions. In fact, it is not hard to show that the condition I had to assume to make this method work, namely -1 _< A _< 1, will hold if and only if the equation A = 4 x^3 - 3x has three real solutions. Now if we have a general cubic equation t^3 + a t^2 + b t + c = 0 with three real zeroes, we can make a linear change of variables that brings it to the form A = 4 x^3 - 3x. Hence the above method becomes applicable. _________________________________________________________________ You ask about Stewart's statement in the proof of Lemma 14.7 on p.133 that the zeros of f are distinct because the characteristic is 0. On p.83 he defined an irreducible polynomial to be separable if its zeros (in a splitting field) were distinct, and in Proposition 8.6 (p.86) he shows that any irreducible polynomial over a field of characteristic 0 is separable. That is the justification for the facts stated here. You also ask whether, when he refers to "the characteristic", he means the characteristic of Q. Whenever one field is contained in another, they have the same prime subfield (see p.3), and hence the same characteristic (p.4). So whenever we are dealing with a family of fields containing a common subfield, we can speak of "the characteristic", which will be the same for all the fields in question. _________________________________________________________________ You ask how the concept of transitivity of a permutation group (p.135) is related to that of transitivity of an equivalence relation. It isn't. But in both cases, the idea of the word "transitive" is that of being able to "get from here to there".
In the equivalence relation case, it refers to the property "if you can get from X to Y and from Y to Z then you can get from X to Z". In the group case, it simply means "You can get from anywhere to anywhere else".
_________________________________________________________________
You ask whether, in the term "general polynomial" introduced on p.139, the statement that the coefficients "do not satisfy any algebraic relation" simply means that they are transcendental elements. It means that and more! To see an example of a polynomial whose coefficients are transcendentals, but which is _not_ a "general polynomial", let x be transcendental over Q, and consider the polynomial t^2 + at + b, where a = x and b = x^2. Each of these coefficients is transcendental over Q, but nonetheless, the coefficients satisfy an algebraic relation, namely a^2 - b = 0; so this is not a "general polynomial". On the other hand, if we form a polynomial ring Q[x,y] in two indeterminates, then t^2 + xt + y is a "general polynomial". Stewart's introduction is merely intended to give a general idea. He makes the precise definition in the middle of p.143, in the next reading.

You also ask "How are general polynomials easier to work with than polynomials over Q?" I hope that the preview I gave of the next reading partly answered that. But aside from their being, in certain ways, easier to work with, their importance comes from the idea I sketched last time, that if we can find a formula for the zeros of a "general polynomial", we may be able to "substitute" values in the base field for the coefficients, and so get a formula for finding the zeros of arbitrary polynomials.
_________________________________________________________________
You ask what Stewart means by a "nontrivial polynomial" in the third line of the Definition at the top of p.140. He means a _nonzero_ polynomial. (He is confusing two phrases, "nonzero polynomial" and "nontrivial polynomial equation". An equation p(t_1,...,t_n) = 0 is called "trivial" if p is itself the zero polynomial, since in that case this equation holds automatically, and doesn't tell us anything about t_1,...,t_n. If the equation is not trivial, it is called "non-trivial". So the equation is non-trivial if and only if the polynomial is nonzero, hence his mistake is easy to make.)
_________________________________________________________________
You ask whether the concept of "independent indeterminates" is related to that of linear independence. Where Stewart speaks of "independent indeterminates", my handout uses the more standard phrase "algebraically independent elements". In the last couple of pages of the handout, I discuss at length the analogy between algebraic independence and linear independence. There is one concrete relation between them: alpha_1,..., alpha_n are algebraically independent if and only if the set of all (infinitely many) monomials alpha_1^m_1 ... alpha_n^m_n is linearly independent.
_________________________________________________________________
You ask what I mean by "composing with this isomorphism" near the top of p.2 of the handout. I should have made this more precise. If i: K[beta_1,...,beta_s] --> K[t_1,...,t_s] is an isomorphism, then it induces an isomorphism i': K[beta_1,...,beta_s][t_{s+1}] --> K[t_1,...,t_s][t_{s+1}].
Composing with the substitution map K[t_1,...,t_s,t_{s+1}] --> L we get a map K[beta_1,...,beta_s][t_{s+1}] --> K[t_1,...,t_s][t_{s+1}] --> L which will have nontrivial kernel if and only if the substitution map does (since the map we are composing with is an isomorphism).
_________________________________________________________________
You ask whether, in the proof of Lemma 15.1 in the handout, the fact that beta_i is algebraic over K[beta_i(1),...,beta_i(s)] doesn't follow directly from non-one-one-ness of the substitution map. Well, it follows from that non-one-one-ness _together_ with the fact that on the subring K[t_1,...,t_s], the map is one-to-one. Not being careful of that distinction is what leads to the erroneous proof of Lemma 15.2 in Stewart, so I take pains to make clear how one-one-ness on the subring is used in proving the result of this lemma.
_________________________________________________________________
You ask how the inequalities that Stewart obtains in the proof of Lemma 15.3 (p.142) lead to the conclusion that F = K(s_1,...,s_n), and exclude the possibility that it is larger. Well, by the tower law, [K(t_1,...,t_n) : K(s_1,...,s_n)] = [K(t_1,...,t_n) : F] [F : K(s_1,...,s_n)]. The first term on the right-hand side is n!, and if F were larger than K(s_1,...,s_n), the second term would be > 1, so the product would be > n!. But Stewart has shown the left-hand side _< n!, giving a contradiction.
_________________________________________________________________
You ask what Stewart means by saying on p.143, 6 lines from the bottom, "The s_i are now the elementary symmetric polynomials in t_1, ... t_n." Whenever a monic polynomial t^n - a_1 t^{n-1} + ... +- a_n can be factored as (t - alpha_1) ... (t - alpha_n), the coefficients a_i can be expressed as symmetric polynomials in the zeros alpha_i; e.g., a_1 = alpha_1 + ... + alpha_n. So he means here that the indeterminates s_i can be expressed in this way in terms of the zeros t_i of the polynomial having them as coefficients.

Hmm, I see that this becomes confusing if one tries to write it out. In the notation of chapter 2, the general relation between coefficients and zeroes is expressed by the formulas a_i = s_i (alpha_1 ,..., alpha_n). But if we want to apply these formulas in the present situation, "s_i" stands for both the field element and the polynomial operation; so there is no good way to write what is meant. Another good reason for using "u_i" and "v_i" instead of "s_i" and "t_i".
_________________________________________________________________
You ask why the "general polynomial over K" is so called, when it is not a polynomial over K. Perhaps a better name for it would be "the general polynomial of degree n _for_ K", or something of the sort. Anyway, it is a construction where, starting with any field K, and any positive integer n, one gets a certain polynomial, which is essentially uniquely determined by K and n ("essentially" meaning that if we take two versions of it, there is an isomorphism between the fields over which they are defined, making one polynomial correspond to the other). The property of being "general" (having algebraically independent elements for coefficients), which it has over K, it does not have over K(s_1,...,s_n) (s_1,...,s_n are not algebraically independent over that field, since they are in it), so it wouldn't be appropriate to call it the "general polynomial over K(s_1,...,s_n)". Maybe "general polynomial over K" really is the best term; it's just a different sense of "over".
A polynomial over K is not in K, but it is in something constructed from K, namely K[t]. Similarly, a general polynomial is also characterized as being in something constructed from K -- but something different from the object used in defining "polynomial over K". It's a question of whether one wants to allow the word "over" to have many related meanings, or try to coin different words for each of these meanings.
_________________________________________________________________
You ask about generalizing Theorem 15.10 (p.146) to positive characteristic, if the extension is separable. Inseparability isn't the problem -- in fact, an inseparable extension arises by starting with a separable extension of characteristic p, and adjoining p-th roots, which we can consider "radicals". The problem is that in characteristic p, there cannot be a primitive p-th root of unity, i.e., an element of multiplicative order p, because the polynomial t^p - 1 factors as (t-1)^p, so in any extension field, its only root is 1. However, one can investigate what interesting sort of element will generate a separable normal extension with Galois group of order p in that characteristic, and get a nice answer; and if we modify the definition of "radical" to include this kind of element, then the same result holds. Stewart mentions this very briefly in the Remark on p.147. If we don't make this modification, the theorem still holds for finite normal extensions whose Galois groups are solvable and of order not divisible by p.
_________________________________________________________________
You ask about the implication on p.147, "phi is a monomorphism ==> Gamma(N:M) is isomorphic to a subgroup of Gamma(L:K)". A one-to-one homomorphism of algebraic objects can in general be regarded as an isomorphism of the domain with the image of the map, which is a subobject of the codomain. I.e., if we look at phi as a map from Gamma(N:M) to the subgroup phi(Gamma(N:M)) of Gamma(L:K), it will be one-to-one because phi was to begin with, and onto because phi(Gamma(N:M)) is its image; hence it will be an isomorphism.
_________________________________________________________________
You ask about the strategy of pp.148-149, and in particular, why we look for elements invariant under the permutation group and its subgroups.

The first thing to be aware of is that there are two different points of view here. One of them is "We form the rational function field F = K(s_1,...,s_n) generated by n independent transcendentals, consider the splitting field L over F of the polynomial t^n - s_1 t^{n-1} + ... +- s_n, let t_1, ..., t_n be the zeroes of this polynomial in the splitting field, and try to figure out formulas for the t's in terms of the s's, using field operations and radicals." The other point of view is "We form the rational function field L = K(t_1,...,t_n) generated by n independent transcendentals, let the permutation group S_n act on this field by permutations of the t's, let s_1,...,s_n denote the elementary symmetric polynomials in the t's, which we have proved generate the fixed field F of the action of S_n, and (again) try to figure out formulas for the t's in terms of the s's, using field operations and radicals." Now from Lemma 15.3 and Theorem 15.6 (which you should review if you have not fixed the facts proved there in your mind) the two situations just described are exactly the same!
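(Another aside of my own, not in Stewart: for the very smallest case n = 2 one can see "formulas for the t's in terms of the s's" completely explicitly, and even have a computer algebra system check the algebra. Here is a sketch using Python's sympy; the names t1, t2, s1, s2, X are just my ad hoc choices.)

  # Illustration for n = 2: s1 and s2 are the elementary symmetric polynomials
  # in t1, t2; t1 and t2 are the zeros of the "general quadratic" X^2 - s1*X + s2,
  # and the quadratic formula recovers them from s1, s2 using field operations
  # and one square root.
  from sympy import symbols, expand, factor

  t1, t2, X = symbols('t1 t2 X')
  s1, s2 = t1 + t2, t1 * t2                 # elementary symmetric polynomials
  quadratic = X**2 - s1*X + s2              # the general polynomial of degree 2

  print(expand(quadratic.subs(X, t1)))      # 0: t1 is a zero
  print(expand(quadratic.subs(X, t2)))      # 0: t2 is a zero
  print(factor(s1**2 - 4*s2))               # (t1 - t2)**2, so the square root of
                                            # s1^2 - 4 s2 is +-(t1 - t2), and
                                            # t1, t2 = (s1 +- sqrt(s1^2 - 4 s2))/2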
So as you read this section, you should keep both viewpoints in mind, and if a statement doesn't seem to make sense from one point of view, see whether it does from the other.

The strategy used in the section in the cases of degrees n = 2 and 3 is to find expressions in the t's which are invariant under the permutation group S_n, and from which the t's can be recovered using radicals. The fact that the expressions are invariant under S_n means that they lie in F = K(s_1,...,s_n), so that we can figure out expressions for them in terms of the coefficients s_i; the fact that the t's can be recovered from them using radicals means that when we do so, we get formulas expressing the zeros of our polynomial in radicals. (As for the choice of particular expressions to use, this is motivated by the proof of Theorem 15.8, as I indicated in class.) When we come to degree 4, the idea is the same, but we do it in two steps: From the observation {1} <| V <| S_4 we get a tower L \contains V-dagger \contains F, and we first find a nice set of generators for V-dagger over F by looking at polynomials invariant under V, then work out how to generate L over V-dagger. (We really did n = 3 in two steps too, using {1} <| A_3 <| S_3; but the steps came so close together that one could see the light at the end of the tunnel as one was coming into the tunnel.)
_________________________________________________________________
You ask how, on p.150, Stewart gets the formulas y^3 + z^3 = -27 q, y^3 z^3 = -27 p^3. Well, have you tried calculating y^3 + z^3, verifying that it is a symmetric polynomial in t_1, t_2, and t_3, expressing it in terms of the elementary symmetric polynomials, and then noting what happens when, as a result of the Tschirnhausen transformation, s_1 is set to zero? As for y^3 z^3, it is easier to start with yz, which, as I pointed out in class, is also symmetric, figuring out how to express it, and then cubing the result.
_________________________________________________________________
You ask about Stewart's use of the phrase "Frobenius monomorphism" on p.156, and whether "monomorphism" is a term of his own invention. He definitely did not invent the word. It was invented, I believe by Bourbaki, to mean "one-to-one homomorphism", essentially because French does not have an easy way to make "one-to-one" into a modifier. Likewise, they coined "epimorphism" (in French these words end in "-isme", and "epi-" has an accent on the "e"; but I'll use them here in their English forms) to mean "onto homomorphism". They were then taken into English, even though we don't really need them.

At this point the story becomes complicated. When Category Theory (which I can't explain here) was created, its creators wanted to come up with abstract category-theoretic versions of various concepts from traditional mathematics. They found a pair of mutually dual properties, one of which, in _most_ classical cases, characterized one-to-one maps, so they named it "monomorphism", while the dual concept in many cases likewise characterized onto maps. In many other cases, it did not, but the classes of maps that it did characterize turned out to be of great interest nonetheless. Being dual to "monomorphism", it had to be called "epimorphism", despite the fact that its meaning frequently didn't match Bourbaki's original meaning.
Well, the category-theoretic usage has become well-established and important, so each of these words now has two common meanings, which for some classes of mathematical objects agree, and for others don't. My preference is to avoid using these two words except where one means the category-theoretic concept, as distinct from the classical concept, which one can express explicitly. (Even in French, one can express the classical concepts by the phrases "homomorphisme injectif" and "homomorphisme surjectif".)

Now back to Stewart: He uses "monomorphism" in the classical sense of "one-to-one homomorphism"; but why does he continually say "monomorphism of fields" where one would simply expect "homomorphism of fields"? Every homomorphism between fields is one-to-one _except_ the map that takes all field-elements to 0. So he uses the word "monomorphism" just to make clear that he is excluding that map. Most modern algebraists specify that unless the contrary is stated, all rings have 1 and homomorphisms are defined to take the element 1 of the domain ring to the element 1 of the codomain ring. Therefore the map between two fields taking everything to zero is not considered a homomorphism, and one can simply say "homomorphism of fields". However, for some reason I don't understand, authors of textbooks feel they are being virtuous when they define rings in a way that allows rings without 1. Hence they can't specify that homomorphisms must take 1 to 1; hence the zero map would count as a homomorphism of fields if one didn't exclude it in some other way; hence Stewart's use of "monomorphism" to achieve this.

As for what the Frobenius map is called by other mathematicians: it is the "Frobenius endomorphism", where "endomorphism" means "homomorphism from an object to itself".
_________________________________________________________________
You ask about GF(25) as a splitting field of t^25 - t, as in Theorem 16.4, versus the assertion in Example 2, p.159, that it is a splitting field of t^2 - 2 (in each case, over Z_5). It is both! The theorem has been proved. On the other hand, since t^2 - 2 is irreducible, its splitting field has degree 2 over Z_5, hence has 5^2 = 25 elements, hence must also equal GF(25).

To get some additional insight, let us note that the theorem shows that the splitting field of t^25 - t has 25 elements, hence has degree 2 over Z_5, so its zeros must all have degree 1 or 2. Hence their minimal polynomials, the irreducible factors of t^25 - t, must have those degrees. There will be 5 factors of degree 1, namely t, t-1, t-2, t-3, t-4, hence the result of dividing by the product of those factors, a polynomial of degree 25 - 5 = 20, must be a product of quadratic factors, hence there must be exactly 10 of the latter. Each of those ten quadratic factors will have a splitting field of degree 2 over Z_5, hence equal to GF(25). One of those factors must in fact be t^2 - 2 (since a zero of that polynomial belongs to GF(25), and so is a zero of t^25 - t).
_________________________________________________________________
You both asked why the results of section 17.1 are stated only for subfields of the real numbers. Simply because Stewart is giving the solution to the classical problem of constructing figures by ruler and compass, and this problem was posed in the plane R^2. We could, of course, consider K^2 to be a "plane over K" for any field K; but then we'd have a whole different subject to examine.
E.g., over the real numbers, a line that passes through the center of a circle intersects the circle; but if we consider the "plane" Q^2, then the circle of radius 1 about the origin in that plane and the line y = x don't intersect. On the other hand, in the "plane" C^2 every line intersects every circle, which is also very different from classical plane geometry. The question of construction by ruler and compass is not studied so much for its intrinsic importance as for its historical interest. Since its historical source involved the plane R^2, that is what Stewart considers. If we wanted to study "constructions by ruler and compass in K^2" for general K, we would first have to make precise how geometric concepts and the action of a "ruler and compass" are defined in that context. I'm sure that some people have studied geometry over general fields, but it would take us far afield to go into such a theory here.
_________________________________________________________________
You ask how, on p.173, Stewart gets t^2 + 4 t cot phi - 4 from (17.4). Well, he has defined phi by the condition tan phi = 4. What does that make cot phi?
_________________________________________________________________
You ask how many vertices of a 17-gon one needs to get them all. Well, if, as on p.174, we are given the circle in which the 17-gon is to be inscribed, and we have started off by choosing a point A that is to be one of the vertices, then all we need is one other vertex -- any one -- to construct the whole thing. Once we have one such vertex, we copy the angle between it and the given vertex and propagate it around the center until, after 16 such copyings, we find ourselves back at the starting point, with all 17 vertices drawn. So Stewart gives us more vertices than are needed on p.174. Of course, if we count the starting vertex A, the answer to your question is "two". Finally, if we are not given the circle, but just some points that are somehow known to be vertices of a regular p-gon (p prime), then we need three to construct the polygon: We find the center of the circle on which they lie by intersecting perpendicular bisectors of the segments connecting one of them to the other two, then once we have the circle, we can use two of them to get them all.
_________________________________________________________________
You ask about Stewart's statement on p.170, next-to-last paragraph, that we can perform the ruler-and-compass construction by "faithfully following the above theory". Well, it would be a lot of work to follow through all the arguments in preceding chapters and see how they apply to this case; but after chasing down all those arguments, the mathematical steps that they tell us to go through would be essentially what Stewart does on pp.171-173 without giving details on the reasons for doing it. For instance, a key step in the preceding pages is Proposition 17.4, which shows how to obtain by successive quadratic extensions any extension contained in a normal extension of degree a power of 2. This involves finding an appropriate chain of subgroups of its Galois group; the proof used general properties of p-groups. In the approach to the 17-gon that I outlined Wednesday and commented on again today (essentially the method in Stewart, but with more motivation), we wrote down the Galois group explicitly, and saw the chain of subgroups that was needed, without having to refer to the general theory. And so on.
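(One more aside of mine: if you want to experiment with the group-theoretic skeleton of that 17-gon computation, the following few lines of Python display it. The Galois group of Q(zeta_17):Q consists of the maps zeta |-> zeta^k for k = 1,...,16; it is cyclic of order 16, generated by zeta |-> zeta^3 since 3 is a primitive root mod 17, and the chain of subgroups of orders 16, 8, 4, 2, 1 is what produces the tower of quadratic extensions behind the construction.)

  # The Galois group of Q(zeta_17):Q is {zeta |-> zeta^k : k = 1,...,16},
  # cyclic of order 16.  Check that 3 is a primitive root mod 17, then list
  # the chain of subgroups of orders 16 > 8 > 4 > 2 > 1.
  p = 17
  print(sorted(pow(3, i, p) for i in range(16)) == list(range(1, p)))  # True

  for step in (1, 2, 4, 8, 16):              # subgroups generated by 3**step
      subgroup = sorted({pow(3, i, p) for i in range(0, 16, step)})
      print("order", 16 // step, ":", subgroup)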
_________________________________________________________________
You ask about Stewart's use of the term "perfect square in K" on p.180, proof of Thm 18.2 (3). He simply means "square of an element of K."
_________________________________________________________________
Feedback results from 18/3 to 25/4

        studyg    rev      thnkg      wrtg    *S
        1{1}      1{1}     3{3}       1{1}     6{:6}
        3{3}      2{2}     6{6}       1{1}    12{:12}
        2{2}      1{1}     5[9]       1[2]     9[:14]
        2{2}     36{36}   10{10}      2{2}    50{:50}
        2         1.5      5.5[7.5]   1       10.5[13]

You ask about the second sentence in Stewart's proof of Lemma 19.3, "By passing to a normal closure ...". The point is that if we can prove that a normal closure N of an extension L:K has degree a power of p, then [L:K], being a divisor of [N:K], will also be a power of p. Thus, if we can prove the stated result for all _normal_ extensions N:K, it will be true for all extensions L:K; so in proving it for an extension, we can without loss of generality assume that extension is normal.

This is a common type of step in mathematical proofs: to note that if a result is true in cases where a particular condition holds, then it will be true in the general case we want, and then to say "Hence we can assume without loss of generality [in proving our result] that the particular condition holds." This can indeed be confusing if one is not prepared for it. In particular, it often involves an implicit change of notation: If Stewart had given different names, L and N, to the original extension and its normal closure, as I did above, then he would be "re-naming" N as L. It also often leaves it to the reader to verify that the appropriate relation holds; e.g., Stewart doesn't state that [L:K] is a divisor of [N:K], so that if the latter is a power of p, so is the former; he leaves that to the reader to see! If you were giving such an argument in your homework, you should make a step like that explicit; if you are writing an article or a textbook, you need to weigh whether the reader can be expected to see how to verify the required implications.

Finally, as a reader of mathematics, you should note that in a "we may assume" argument, the situation generally determines the implication that must be verified for the argument to be valid; in this case, "If we prove [N:K] is a power of p, then [L:K] will also be a power of p". Once you think through what that implication is, you only have to find the reason it is true: in this case, [L:K] is a divisor of [N:K] by the tower law, and a divisor of a power of p is a power of p.
_________________________________________________________________
You ask why, on p.187 near the end of the proof of Lemma 19.3, the index of P in G is prime to p. Because P is a p-Sylow subgroup of G! Look at the definition of "p-Sylow subgroup", and note that for p a prime, "p does not divide r" is equivalent to "p is prime to r".
_________________________________________________________________
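(A final aside of mine on those last two answers: the arithmetic being used -- that a divisor of a power of p is a power of p, and that the index of a p-Sylow subgroup is prime to p -- is simple enough to check by machine. Here is a Python sketch; the group order 48 is just a made-up example.)

  # Two tiny arithmetical checks behind those last two answers (my own
  # illustration; the group theory itself is in Stewart).
  from math import gcd

  def sylow_data(group_order, p):
      """Write |G| = p**a * m with p not dividing m; a p-Sylow subgroup
      then has order p**a and index m, and m is prime to p."""
      a, m = 0, group_order
      while m % p == 0:
          a, m = a + 1, m // p
      return p**a, m

  order, index = sylow_data(48, 2)            # e.g. |G| = 48 = 2^4 * 3
  print(order, index, gcd(index, 2))          # 16 3 1  -- the index is prime to 2

  # "a divisor of a power of p is a power of p":
  print([d for d in range(1, 33) if 32 % d == 0])   # [1, 2, 4, 8, 16, 32]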