ANSWERS TO QUESTIONS ASKED BY STUDENTS in Math 113, Fall 2002, taught from Rotman's "First Course in Abstract Algebra", 2nd ed. ---------------------------------------------------------------------- You ask in connection with Goldbach's conjecture on p.3, "When do mathematicians declare that something is "impossible" to do or prove?" The answer is: when they can _prove_ that it is impossible. As for proving something impossible to _do_, we will see an example toward the end of this course: The field theory we will learn will show that it is impossible to trisect the general angle using straightedge and compass, as the Greeks hoped to. As for something being impossible to prove, there are two senses in which one can prove that. One is to prove that from A_1, ..., A_n you can't prove B by showing that there are examples where A_1, ..., A_n hold but B does not. Thus, one of Euclid's axioms for geometry seemed "less obvious" than the rest, and for centuries people tried to figure out a proof of it from the others. Finally, it was shown that there existed a "geometry" in which the others held but that one didn't; so that one can never be proved from the others. More tricky is the situation where a result might truly follow, yet not be provable. This depends on precisely specifying the methods of proof allowed, analyzing what can be done with them, and showing that certain true results can't be obtained. This is the domain of logicians, who have proved that there exist such statements; but so far as I know, no statements of interest for themselves have been shown to fall in that category. As for saying something is "impossible" when one really just means "I give up", some mathematicians may do that privately, but they know better than to make such an imprecise statement in print. ---------------------------------------------------------------------- You ask how in the proof of Proposition 1.3 on p.5 one can know that a >_ 2, as is needed to apply Theorem 1.2. The only alternative to a >_ 2 is a = 1; but if this were true, then the equation ab = m would give b = m, contradicting the assumption b < m. ---------------------------------------------------------------------- You ask in connection with p.6 whether the determination of the least criminal "implies the honesty of other statements in any indexed set of statements". No, if S(n) is the "least criminal", this implies that for all i < n, S(i) is "honest" (that's what it means for S(n) to be the _least_ criminal), but it doesn't say anything about the S(i) for i > n. But the important way the "least criminal" principle is used is in arguing that if there is _no_ least criminal, then all S(n) must be honest. That is the argument behind mathematical induction. ---------------------------------------------------------------------- You ask (with reference to the beginnings of the proofs on pp.5 and 7) how much we are allowed to assume in this course. Good question. Basically, standard arithmetic properties of the integers and real numbers. This chapter develops many such facts that we will use, while taking others for granted in proving those. In later chapters, we will be off in the abstract world of general groups, rings, and fields, and the question is rarely likely to come up, because the material should be almost all new (unless, of course, one has had another course in the subject, in which case one must not assume results from that course).
---------------------------------------------------------------------- You ask in connection with the discussion on p.8 whether, in doing a proof by induction, there is any way to check whether S(n) is true or not before we assume it. If by "check" you mean "be absolutely sure it is true", this would require you to have some other proof of it, and if you already had such a proof, you wouldn't need to do the proof by induction. But if you mean "are there ways to decide what results you think should be true, before you try to prove them by induction", yes: experimentation with small values of n (and "inductive reasoning" applied to the result of your experiments), intuition, etc.. In particular, this applies to formulas like that of Proposition 1.7, and the other formulas mentioned in footnote 4 on p.9. ---------------------------------------------------------------------- You ask how Exercise 1.13 on p.17 can be done by induction, given that Rotman doesn't say that n has to be >_ 0. Looking at his hint on p.505, I think he intended n to be >_ 0. But I find that the result is also true for n < 0, and can be proved by another induction: If one knows it for n = -m (m >_ 0) one can prove it, from that case, for the case n = -(m+1). Hence, after verifying the base case, one concludes that it is true for n = -m for all m >_ 0; i.e., for all n _< 0. Anyway, Rotman ought to say in the problem what values of n are intended. I will mention this in the letter of corrections I send him at the end of the Semester. Thanks for pointing it out! ---------------------------------------------------------------------- You ask how one could discover De Moivre's Theorem (p.26). Well, given that one is interested in raising complex numbers to powers, it is natural to apply the addition theorem (preceding page) repeatedly. One sees that when one does this, repeated addition of angles gives expressions "cos nx + i sin nx". The proof on p.26 by induction is just a way of precisely expressing these observations. ---------------------------------------------------------------------- You ask about the statement "Thus, the nth roots of unity are equally spaced around the unit circle" at the end of the top paragraph of p.30. Well, at the end of the paragraph, one sees that zeta^n = 1, so zeta is an nth root of unity. From the equation zeta^n = 1, it is easy to deduce the equations (zeta^k)^n = 1 for all k; so all the zeta^k are nth roots of unity, and we have just seen that these occur at successive angles of theta. I'm afraid Rotman didn't notice that to "see" this, the student would have to notice that zeta^n = 1 implies (zeta^k)^n = 1, and then review the paragraph with this in mind. ---------------------------------------------------------------------- You ask why, in the last paragraph of p.30, Rotman notes that zeta^k is an nth root of 1 for k = 0,...,n-1, but doesn't mention k = n. That is because when we reach k = n, the values of zeta^k start to repeat themselves: zeta^n = 1 = zeta^0, zeta^{n+1} = zeta = zeta^1, etc.. So using k = 0,...,n-1 one gets all the distinct powers of zeta. ---------------------------------------------------------------------- You ask why (0,0) is defined to be 0 (p.41). Because 0 has the properties that it is a common divisor of 0 and 0, and that every common divisor of 0 and 0 divides it.
So even though 0 does not have what I called in class the "naive" property: that of being a common divisor of 0 and 0 that is largest numerically (no number has that property, because every integer is a common divisor of 0 and 0, and there is no largest integer), it does have what I described as the "important" property. Also note that the set of linear combinations of 0 and 0 consists precisely of all multiples of 0. So this other useful property of gcd's also applies to this case. ---------------------------------------------------------------------- You ask about applying Theorem 1.29 (p.42) to the case where a=24, b=36. In that case, s = -1, t = 1 gives the gcd. ---------------------------------------------------------------------- You ask, regarding the proof on p.43 of Proposition 1.30: > Rotman said necessity is ==>, and sufficiency is <===. I've always > thought it was the other way around. Take "if p, then q," for example. > q could be true without p being true, so p is not necessary. if p is > true, then it is sufficient to reach the conclusion that q is true. I've always found these terms confusing, too. Which direction they go depends on which statement is the main clause of one's sentence. In the example you gave, "if p, then q", the main clause is "q", while "if p" is a conditional clause; so you are correct in saying that there, "==>" was a statement of sufficiency (the if-clause is sufficient for the main clause to hold.) But in Rotman's Proposition 1.30, the main clause, "d is their gcd", comes before the other clause, "if and only if c|d for every common divisor c", so in that case, the proof that the "if and only if" clause was sufficient for the main clause to be true is "<==", and the proof that it was necessary was "==>". ---------------------------------------------------------------------- You ask about Rotman's phrase "we claim that every element a in I is a multiple of d" at the beginning of the second paragraph of the proof of Corollary 1.31, p.43. When a mathematician "claims" something, it means that he or she is about to justify that claim by giving a proof. The next few sentences that Rotman gives are the proof: They take the given element a\in I and show that it must be a multiple of d. ---------------------------------------------------------------------- You ask about the step "Since p|ab it follows that p|b" at the end of the proof of Euclid's Lemma on p.43. Rotman is assuming that certain arithmetic implications are so straightforward that they don't have to be pointed out to the reader: If p|ab, then p|tab; also, p|spb because "p" is one of the factors shown; now since p has been shown to divide both tab and spb, it divides their sum. However, every time I've taught 113 (using various texts) this point has always caused some students trouble; so authors ought to give more details. ---------------------------------------------------------------------- You ask why Rotman says on p.45 that the Greeks needed a "sophisticated geometric argument" to prove that a:b = c:d => a:c = b:d; and you note a 5-line calculation giving the same fact. Your calculation translates a:b = c:d as equality of the numbers a/b and c/d, and the point Rotman is making is that the ancient Greeks did not have the concept of an abstract real number such as a/b. 
They had a concept of length a, b, c, d, and they had the concept of the relationship between two lengths being the same as the relationship between two others, which to them was a geometric concept; and these were all they could use in their reasoning. ---------------------------------------------------------------------- You ask about the formulas describing the Euclidean algorithm on p.47. The remainders in question are successively smaller, r_0 > r_1 > r_2 > ... , and each remainder is gotten from the two that precede by subtracting from the larger the largest possible multiple of the smaller that leaves a nonnegative remainder. So to get r_3 one takes r_1 - q_2 r_2. So one has r_3 = r_1 - q_2 r_2, which can be written r_1 = q_2 r_2 + r_3, and this is the form Rotman uses. ---------------------------------------------------------------------- You ask whether the pseudocode for the Euclidean algorithm on p.48 should first test whether a >_ b. Yes and no. In Rotman's proof of Theorem 1.37, he does assume that. But the definition of remainder(a,b) doesn't require it; if a < b then remainder(a,b) is simply a, and at the next step, when the algorithm takes d := s, s := rem, the result is just to switch a and b. Then, from that step on, we always have d > s. Well, thanks for pointing this out. I'll include it in my list of comments for the author at the end of the Semester. ---------------------------------------------------------------------- You ask about the dependence of Lame's theorem (p.51) on the fact that b is expressed in base 10. This assumption is used on the 4th and 5th lines of p.52. If one were using another base B, one could replace the "5" in the result with 1/log_B alpha, or, of course, any number larger than that. It happens that when B = 10, 1/log_B alpha is very slightly less than 5, so the approximation is only slightly weakened by using 5 instead of the exact value. For arbitrary B, one could still replace 1/log_B alpha with an integer larger than it, and so get a formula like Lame's with some (possibly) different integer in place of 5. But if this did not give a good approximation, one might prefer to state the theorem using either the exact value of 1/log_B alpha, or some rational value slightly larger than this. ---------------------------------------------------------------------- You ask how Rotman gets Sigma d_i b^i _< Sigma (b-1) b^i on p.53, first line of first display. Remember that the d_i are all assumed < b, hence since they are integers, they are _< b-1. So d_i b^i _< (b-1) b^i; so adding up these inequalities, one gets Sigma d_i b^i _< Sigma (b-1) b^i. (Note what this says when b = 10, so that we are in base 10: that any k-digit number is _< the k-digit number having all digits equal to 9. E.g., 2002 _< 9999.) ---------------------------------------------------------------------- You ask about the middle step in the first display on p.53, Sigma_i=0 to k b^{i+1} - Sigma_i=0 to k b^i = b^{k+1} - 1. The phrasing of your question assumes that Rotman is saying that the first summation equals b^{k+1}, and the second equals 1; but this is not so. (After all, this would say that the first sum was equal to one of its summands.) Rather, what he is saying is that if you compare the two summations, they have the terms b^1, b^2, ... , b^k in common, so when you subtract them, all those terms cancel out. All that is left is the b^{k+1} from the first summation, and the b^0 = 1 from the other; and those are what he writes on the next line.
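To make this concrete (my own numerical example, not one from the text): take b = 10 and k = 3. Then the two summations are 10 + 100 + 1000 + 10000 and 1 + 10 + 100 + 1000; when you subtract, the common terms 10, 100 and 1000 cancel, and what is left is 10000 - 1, i.e., b^{k+1} - 1.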
I don't know whether you took Math 1B in Berkeley; the relevant techniques of calculating with summations are treated there. ---------------------------------------------------------------------- You ask how, after the last display on p.53, Rotman can say "Since I and J are disjoint, we may assume that q < p." From the fact that I and J are disjoint, it follows that p and q are different. The idea Rotman then uses is that "there is no essential difference" between the roles of p and q, so we can take p to be the larger of the two. To say this more precisely, note that p is the largest of the values of i for which d_i - c_i > 0, i.e., for which d_i > c_i, and q is similarly the largest of the values of i for which c_i > d_i. So if we happen to have p < q instead of the desired relation q < p, we can interchange the two given expressions Sigma c_i b^i and Sigma d_i b^i for m. Using the interchanged expressions, we find q < p, and can carry out the same argument. ---------------------------------------------------------------------- You ask about Rotman's statement on p.55 that six lead weights may not be enough to weigh a very heavy person, asking "can't you just increase the weight of the lead weights?" I removed these examples from the reading (as noted on the blackboard Monday) both because they are too far from the subject of the course, and because they are not clearly stated. Rotman seems to be assuming that everyone's weight is an integral number of pounds, and that one wants to find that integer; and moreover, one wants a set of weights that will work from 0 up to some particular weight N. With these assumptions, the best one can do with 6 weights is all integers from 0 to 364. (Instead of the silly assumption that everyone's weight is an integral number of pounds, his idea may be that with this set of weights, one can find for each person the two integers between which his or her weight lies by seeing for which combinations of weights the scale tips one way, and for which it tips the other way. This can indeed be done with the set of weights he refers to; but since he doesn't discuss weighing non-integer people, we can only guess what he was thinking.) ---------------------------------------------------------------------- Regarding the proof of Theorem 1.40 (p.58) you ask "Doesn't this proof beg the question of what it's trying to prove?" Every inductive proof looks as though it is "begging the question" if one forgets that the method of proof by induction is to prove the general case _assuming_ the lower cases (and also to prove the base case). You should review the topic of induction, and then look at this proof again. If you still think it is "begging the question", write me again saying very specifically where you think it is doing so. ---------------------------------------------------------------------- You ask about the relation between "max" used in the statement of Prop.1.44, p.60, and "sup" (supremum). "max" is used for the largest element in a set. Hence if one has a set like S = {0, 0.9, 0.99, 0.999, ...}, max(S) will be undefined; but one can still find a smallest number that is >_ all members of S, in this case 1, and that is the meaning of "sup". For a finite set, these are the same. For an infinite set of real numbers, the sup will exist if the set is bounded above, but the max may not; if the max does exist, it will equal the sup. 
So when dealing with infinite sets, one generally writes sup (unless one wants to emphasize that in a particular case, the element is in fact a member of the set), and for finite sets, one generally writes max, as you suggest. ---------------------------------------------------------------------- You note that in the study of congruences (p.62 et seq) m is always taken nonnegative, and you ask what happens if m is negative. One can define a == b mod m in the same way; this condition will be equivalent to a == b mod -m, since the multiples of m and of -m are the same. It is seldom mentioned merely because it doesn't give a new relation. If one allows negative m, one has to make a few adjustments in statements of results, e.g., in Prop. 1.46 one must replace "< m" by "< |m|" for it to hold for both positive and negative m. ---------------------------------------------------------------------- You ask about the relation between Prop. 1.45(i), a == a mod m (p.63), and the Remark after it, that (i) says that congruence is "reflexive". Well, in mathematics one deals with various relations between two numbers or other entities -- equality, congruence mod m, ">", ">_", "divides", etc.. -- and various properties are given names. If "R" is such a relation, so that for elements x and y of a certain set X it makes sense to say that xRy is true or false, then we call R "reflexive" if for all elements x, xRx is true (so congruence mod m is reflexive, but ">" is not). Likewise, we call R "symmetric" if the analog of (ii) holds, and "transitive" if the analog of (iii) holds. ---------------------------------------------------------------------- You ask, with reference to Example 1.10, p.65, whether it suffices to use a table for some proofs. If the statement asserted in the proof can be proved to depend on a small number of calculations, then those calculations can be given as the proof, and the results displayed as a table. I hope it is clear to you why the result asked for in Example 1.10(i) just depends on 8 calculations: Because of Corollary 1.47, which says that a must be congruent to one of 8 values mod 8, and Prop. 1.48, which says that the value mod 8 of the result of an arithmetic calculation with a depends only on the the value of a mod 8. ---------------------------------------------------------------------- You ask whether the Chinese Remainder Theorem (p.69) works for arbitrary numbers of congruences, as long as the moduli are relatively prime. The concept of "relatively prime" has to be made more precise when there are more than two numbers involved: There is a weak condition, that the largest integer dividing all the numbers should be 1, and the stronger condition that every pair of the numbers be relatively prime. For instance, {6, 10, 15} and {6, 10, 11} are sets that satisfy the weaker condition but not the stronger; {6, 13, 25} satisfies the stronger (and hence the weaker). {2, 4, 6} satisfies neither. If the stronger condition is satisfied, we say that the numbers are "pairwise relatively prime". A finite family of congruences with pairwise relatively prime moduli always has a solution. This is not hard to prove by induction from the Chinese Remainder Theorem. On the other hand, a family of congruences whose moduli satisfy the weaker sort of relative primality need not have a solution. 
For instance, to get a family of congruences with moduli 6, 10, 11 which has no solution, simply take a congruence modulo 6 and a congruence modulo 10 which have no common solution (say, x == 0 mod 6 and x == 1 mod 10), and throw in any congruence modulo 11 (say, x == 2 mod 11). Finally, if one has an _infinite_ set of congruences, even if the set of moduli is pairwise relatively prime it may not have a solution. ---------------------------------------------------------------------- You ask where you can find a rigorous definition of "set" (p.82). The word can't really be defined -- it has to be taken as a primitive concept, in terms of which other mathematical concepts are defined. However, one _can_ set down a small number of basic assumptions on what is true about sets, from which one will deduce all the other properties one will use; and one can also discuss informally what one thinks of a set as being, and why one should or shouldn't make given assumptions about them. These things are (hopefully) done in Math 135, Set Theory. You ask, in particular, how one defines "the ordered pair (a,b)" (p.83, bottom). It turns out that this does not have to be taken as a primitive concept; set theorists define it in terms of more basic set operations as the set {a, {a,b}} (i.e., the set with two members, one of which is a, and the other of which is a set whose members are a and b, and which therefore has one or two members depending on whether a = b). However, this definition does not represent the intuitive concept of an ordered pair -- it is simply a gimmick to get the key property that an ordered pair is supposed to have: That it is a mathematical object which uniquely determines and is determined by two elements, a "first component", a, and a "second component", b. If you think of (a,b) as meaning the set {a, {a,b}}, this will not help you much; but if you think of it as an entity with the properties mentioned above, and think of the construction {a, {a,b}} as showing that such an entity exists, it may help. ---------------------------------------------------------------------- You ask about the difference between codomain and range (p.85). Rotman makes clear the distinction he is making between these words when he introduces the terms on that page; I hope that distinction is clear to you. But the word "range" has been used by some authors with the meaning "codomain" and by others with the meaning "image"; this is the reason for the introduction of the terms "codomain" and "target" in recent years to express the concept for which there was previously no unambiguous term. If you see "range" used by a writer, you need to check whether he or she defines the sense in which it is being used, or try to determine which is meant by the things said about it. ---------------------------------------------------------------------- You ask about the terminology of "inclusion" and "restriction" (p.85). Well, if a set X is "included", i.e., contained, in the set Y, then we get a map X --> Y which gives no information other than what comes from this inclusion situation; so it is called the inclusion map. The concept of restriction is based on a more complicated situation -- where we not only have a subset X of Y, but also a map f from Y to another set Z. Then we get a map from the smaller set X to Z, taking each element x in X to f(x) in Z, but not defined on elements of Y that don't belong to X.
This contains part of the information given by the function f, but not all; it is gotten by "restricting" the operation of f to the subset X; so it is called "the restriction of f to X". The two concepts are related: The restriction of f to X is the composite f o i of the function f: Y --> Z with the inclusion function i: X --> Y. But "composite" isn't defined until tomorrow's reading (p.89). ---------------------------------------------------------------------- In connection with injectivity (p.88), you ask about the "horizontal line test" from calculus. That is equivalent to injectivity in the case where f is a function from R to R, so that its graph can be looked at as a subset of the plane R x R. When f is a function from some arbitrary set X to another set Y, one doesn't have this option. (Though you may find it useful as a way of conceptualizing one-one-ness. Everyone has their own mental images that they find useful.) ---------------------------------------------------------------------- You ask about the relation between "one-to-one function" and "one-to-one correspondence" (p.90). As you point out, these do not mean the same thing: a "one-to-one function" means an injective function, but a "one-to-one correspondence" is equivalent to a bijective function. The reason is that when one says "function", one is thinking of one direction (a function f: X --> Y goes from X to Y), while when one says "correspondence" one is thinking of both directions. (For instance "x == y mod 10" gives a correspondence between certain integers x and certain integers y.) So when the modifier "one-to-one" is attached to "correspondence", it affects both directions, and gives a stronger meaning: That each element of each set corresponds to exactly one element of the other. "Correspondence" is not a commonly used mathematical term in general; but the particular phrase "one-to-one correspondence" is in common use, so one simply has to learn what it means, and not confuse it with "one-to-one function". ---------------------------------------------------------------------- You ask about terminology related to countability (p.94). There are three "smallness" conditions on the number of elements in a set X that are relevant throughout mathematics: One may know that (i) X is finite, or that (ii) X has the same number of elements as the natural numbers, or that (iii) either (i) or (ii) holds. Of course, the word "finite" describes (i). The word "countable" has been used in the past sometimes for (ii) and sometimes for (iii). People have introduced various pairs of terms to fix this ambiguity: "countable" for (ii) and "at most countable" for (iii) is evidently what is used in your 104; "denumerable" for (ii) and "at most countable" for (iii) is used in the text from which I am teaching 250A. When mathematicians talk to one another, context usually makes it clear which they mean, so that even if they are used to using different words, they know what the other means. "Uncountable" always means "having more elements than the natural numbers", in other words, not satisfying (iii); that's why in your 104, "uncountable" doesn't mean the same as "not countable". There is no mathematical disagreement between people using the two terms; both agree that (ii) and (iii) are important. (E.g., (iii) describes the cardinalities of sets that can be written as images of surjective maps from the natural numbers, i.e., which can be "listed" in a sequence, with repetition allowed.) It's just a question of words. 
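If it helps to see that last parenthetical remark in concrete terms, here is a minimal sketch in Python (my own illustration; nothing of the sort appears in Rotman) of "listing" a countable set: it runs through all pairs of natural numbers diagonal by diagonal, so that every pair shows up at some finite position in the list. The same diagonal trick is the usual way of listing the rational numbers, with repetitions allowed.

    from itertools import islice

    def pairs():
        """Yield every pair (i, j) of natural numbers, diagonal by diagonal."""
        d = 0
        while True:
            for i in range(d + 1):
                yield (i, d - i)            # all pairs with i + j = d
            d += 1

    print(list(islice(pairs(), 10)))
    # [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), (3, 0)]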
---------------------------------------------------------------------- You ask whether a subset of a countable set (p.94) is countable. Yes! ---------------------------------------------------------------------- You ask whether N is infinite, and if so, whether in the definition of "countable" on p.94, the conditions "X is finite or X has the same number of elements as N" don't contradict each other. Yes, N is infinite, but no, those conditions don't contradict each other. If they were connected by "and" they would contradict each other; but since they are connected by "or", they simply make the definition an inclusive one: it says that the concept of "countable set" includes both finite sets, and _certain_ infinite sets: namely, those infinite sets that have the same number of elements as N. As Rotman mentions, there are different sizes of infinity. The number of elements in N is the "smallest" infinity; so countable means, roughly, "finite, or of the smallest infinite size". ---------------------------------------------------------------------- You ask how one proves that there are only countably many algebraic numbers (p.94). In brief outline: One shows that there are only countably many rational numbers, and from this that there are only countably many polynomials with rational coefficients. Each of these has only finitely many roots (a polynomial of degree n has at most n roots), so altogether this gives only countably many numbers that are roots of such polynomials, i.e., countably many algebraic numbers. (I don't know how much you've seen of countability before, so I can't tell how plausible it will be to you that one can fill in the jumps in the above sketch.) ---------------------------------------------------------------------- You ask whether the set of sizes of infinity is countable (p.94). No -- it is worse than uncountable; it is so big that it doesn't form a set! (Set theory has limitations: One can't talk about "The set of all sets", or one runs into the paradox "Is the set of all sets that are not members of themselves a member of itself?" "All sizes of infinity," like "all sets", is too big to form a set itself.) ---------------------------------------------------------------------- You note that the permutations Rotman gives as examples are bijections on finite sets, and ask whether a bijection of an infinite set with itself can also be called a permutation, in accordance with the definition on p.97. Certainly! The structures we will be looking at in the study of group theory in Math 113 will be mostly (but not exclusively) groups of permutations of finite sets. But mathematicians definitely do study groups of permutations of infinite sets as well. ---------------------------------------------------------------------- You ask how the author goes between lines 4 and 5 on p.99, where he seems to be substituting beta for alpha. Look back at what is being assumed -- the last line (except for the footnote) on the preceding page. The assumption is gamma o alpha = gamma o beta. So at the stage you asked about, he is substituting gamma o beta for gamma o alpha, which are equal by assumption. ---------------------------------------------------------------------- You ask whether it is valid to describe an algorithm by giving an example, as Rotman does for the factorization-into-disjoint-cycles algorithm on pp.100-101, or whether it ought to be done in pseudocode. There are actually three choices: (i) by example, (ii) by pseudocode, or (iii) by a precise statement in words.
In order to do (ii), one would need a programming language with a way of describing lists, since even if the input to the algorithm is looked at as a single function, the output is a list of cycles. There are indeed such languages, but it would be going too far beyond the topic of this course to introduce one here. But what Rotman actually does is both (i) and (iii) -- he does (i) on pp.100-101, and (iii) in the proof of Proposition 2.8 on p.103! So the example on pp.100-101 is just to give you the idea, not to replace a precise description. The general question of whether it is appropriate to present a mathematical procedure by giving an example, or whether it needs a precise formulation, is a touchy one. In our upper division courses, we try always to give precise statements of the results in the curriculum; but sometimes it may be of interest to give a peek at something beyond the curriculum, which we don't have time to develop in full detailed form, by just showing one or more examples. (I understand that the ancient Babylonian mathematical cuneiform tablets did everything by example, never stating general rules, but leaving it for the student to learn the procedure from the examples.) ---------------------------------------------------------------------- You ask why, in the proof of Proposition 2.8, p.101, one wants to show that alpha(i_r) = i_1. This is to show that alpha behaves on the set {i_1,...,i_r} like the r-cycle (i_1,...,i_r). In the definition of an r-cycle on p.99, alpha(i_r) = i_1 is the last of the list of equations that the cycle (i_1,...,i_r) is required to satisfy. ---------------------------------------------------------------------- You ask about Rotman's statement in the proof of Prop. 2.8 on p.103, that "the list i_1, i_2, ... must eventually have a repetition, because there are only n possible values...". First, do you understand how this list is formed? By taking i_1 to be any element that is moved by alpha, taking i_2 to be alpha(i_1), and so on -- for each n, we take i_{n+1} to be alpha(i_n). Now with this way of getting one element after another, we could go on forever, right, getting i_1, i_2, ..., i_100, i_101, ... , i_100,000, etc., even if alpha was just a permutation of {1,2,3,4,5} (for example). But if alpha was a permutation of {1,2,3,4,5}, all the elements i_n that we get must belong to {1,2,3,4,5}. So these infinitely many elements can't all be _different_; i.e., there must be repetitions. If this is still not clear, you should come to office hours so that I can try to clear up the difficulty. If it is now clear, you should re-read that proof in Rotman and compare what I have said with what he says. The way he states this is a common way of arguing in mathematics, so you need to familiarize yourself with it. ---------------------------------------------------------------------- You ask whether Proposition 2.13 on p.108, stating that if n>_2 then every alpha in S_n is a product of transpositions, also holds for S_X, where X is any set. For finite sets of >_2 elements, yes. If X is an infinite set, it will have some permutations that move infinitely many elements. (E.g., Z has the permutation given by f(x) = x+1.) These cannot be written as products of transpositions. However, any permutation which moves only finitely many elements can still be written as a product of transpositions (e.g., the permutation (-1 0 1 2 3) of Z, which moves only -1, 0, 1, 2 and 3, and fixes all other integers.)
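Since the questions above about the disjoint-cycle factorization (pp.100-101) and about writing permutations as products of transpositions (Proposition 2.13) both concern procedures, here is a minimal sketch of both in Python; it is my own illustration, not Rotman's pseudocode. A permutation of {1,...,n} is given as a dictionary of its values; the function traces out the cycles just as in the proof of Proposition 2.8, and the last line writes each cycle (i_1 ... i_r) as the product of transpositions (i_1 i_r)(i_1 i_{r-1})...(i_1 i_2), as in the last display on p.108.

    def disjoint_cycles(alpha):
        """Return the cycles of length > 1 of the permutation alpha (a dict i -> alpha(i))."""
        seen, cycles = set(), []
        for start in alpha:
            if start in seen or alpha[start] == start:
                continue
            cycle, x = [start], alpha[start]
            while x != start:               # must return to start, since alpha is a bijection
                cycle.append(x)
                seen.add(x)
                x = alpha[x]
            seen.add(start)
            cycles.append(cycle)
        return cycles

    alpha = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}          # the permutation (1 2 3)(4 5)
    print(disjoint_cycles(alpha))                   # [[1, 2, 3], [4, 5]]
    print([(c[0], c[i]) for c in disjoint_cycles(alpha) for i in range(len(c) - 1, 0, -1)])
    # [(1, 3), (1, 2), (4, 5)], i.e. (1 2 3)(4 5) = (1 3)(1 2)(4 5)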
---------------------------------------------------------------------- You ask how Rotman gets the four factorizations of (1 2 3) into transpositions given in the first display on p.109. Well, the first one is an application of the last displayed formula on p.108. The second can be gotten from another formula like that, which uses factors of the form (i r) instead of (1 i). For the third, I can't be sure, but a guess is that he wanted to show that one could even have factorizations involving symbols not in the original permutation, so began with "(1 4)" at the end, then, seeing that he had to make the remaining permutations have product (1 2 3) (1 4)^-1, he figured out that permutation, and wrote it as a product of transpositions. Finally, he obviously got the last one by multiplying the third one by (2 3)(2 3) at the end, using the fact that the square of a transposition is the identity. ---------------------------------------------------------------------- You ask about the statement near the top of p.116 about operations being single-valued: "when one says it explicitly, it is usually called the law of substitution". Rotman certainly isn't being clear there. In the phrase "it is usually called the law of substitution", the "it" doesn't mean the operation, it means the fact that it is single-valued. In the display that he writes as the law of substitution, the idea is "If you put the same values into the operation, you get the same answer"; he is using this as a way of expressing the fact that the operation is single-valued. But he really isn't clarifying anything, and I will recommend that he drop this comment. ---------------------------------------------------------------------- You ask, regarding the definition of group on p.116, whether a group G can have more than one operation. A given set can have different group operations on it; but if we use a different operation, it is then considered a different group. We then speak of the result as "two different groups having the same underlying set". ---------------------------------------------------------------------- You ask whether an abelian group has to satisfy the associative law, noting that the definition at the bottom of p.116 doesn't say so. It definitely does have to satisfy the associative law! The definition that you refer to begins "A group G ...". The word "group" was defined earlier on the page, so when the author says "A group G" this means a set with an operation satisfying all the conditions specified in the earlier definition -- including associativity. ---------------------------------------------------------------------- You ask whether the group of motions of the plane, defined on p.128, is the same as the group of permutations of the plane. No, there are permutations that are not "motions of the plane". As specified in the Definition on p.128, motions are distance-preserving permutations; but there are permutations that don't preserve distances. The easiest examples are things like (x,y) |-> (x,2y). A more complicated example is (x,y) |-> (x,y + sin x). And there are examples that are discontinuous; e.g., the function that takes (x,y) to (x+1,y) if x is an integer and to (x-1,y) if it is not. ---------------------------------------------------------------------- You ask about the statement on p.129 that if Delta is a triangle with vertices P, Q, U, and phi is a motion such that phi(Delta) = Delta, then phi permutes the vertices of the triangle. 
This means that phi, when applied to the set {P, Q, U}, carries this set into itself, and gives a bijection of the set with itself, i.e., a permutation of the set. ---------------------------------------------------------------------- You ask about the equation |Sigma(Delta)| = {1}, two lines above Example 2.18 on p.130; in particular, why the 1 is in set-brackets. Good observation -- you've found another typo in Rotman! I'll add it to the list I will send him at the end of the Semester. He was obviously thinking of two true statements at the same time: first, that |Sigma(Delta)| = 1, and second, that Sigma(Delta) = {1}, i.e., the group of symmetries consists of the identity permutation alone. ---------------------------------------------------------------------- You ask about the relation between subgroups (p.134) and subspaces. In every area of algebra, when one has a sort of object defined as a set of elements with some operations on them satisfying certain conditions, one can define a "subobject" to be a subset closed under these operations and satisfying the same conditions. So it is one of a large class of such concepts, of which you will see more: Subrings and subfields (after we have defined "ring" and "field"), in particular. Also, since every vector space V can be regarded as a group, using its structure of addition and ignoring its operation of scalar multiplication, one finds that every subspace of V will be a subgroup. But the converse is not true: A subgroup which is not closed under scalar multiplication is not a vector subspace. ---------------------------------------------------------------------- In connection with the definition of nontrivial subgroup on p.135, you ask whether the empty set is a nontrivial subgroup. It would fit the definition -- if it were a subgroup. But since it doesn't satisfy condition (i), "1\in H", it isn't a subgroup. Because of condition (i), the smallest subgroup of any group is {1}. So the idea behind "nontrivial" is "not the smallest subgroup". ---------------------------------------------------------------------- You ask about sets being "closed" (p.135) or "open" in a group. First, as I said in class, I think Rotman's use of "closed" does not say enough; the proper usage is "closed under the group operation". (This is distinct from "closed under taking inverses", "closed under conjugation", etc., each of which is a useful concept.) As for "open", that term is not used in group theory. It is used in analysis, where the complement of a closed set has useful properties, so such a set is called "open" (and most sets are neither open nor closed). However, in group theory, the complement of a closed set is not a particularly interesting sort of set, so one doesn't give such a set a special name. ---------------------------------------------------------------------- You ask about the difference between a cyclic subgroup (p.137) and a ring. "Ring" is a concept we will come to later in the course -- a set given with two operations, called addition and multiplication, satisfying certain properties. There is no direct connection with cyclic subgroups. When you raise a question like this, it is hard to know how to answer it, because I have to guess what the word "ring" means to you in a mathematical context, and what has led you to think it might be connected with the concept of cyclic subgroups.
So if you bring into one of your questions something we have not studied in the course and that is not treated in one of the prerequisite courses, you should say what you know about the concept, and what, if anything, leads you to think it may be connected with the material. It could be as simple as "I have heard of a concept of `ring' in algebra, but I don't know what it means, and wondered whether it is similar to that of a cyclic group", or it could be "In a course I took elsewhere, we were shown the concept of `ring', namely ...". ---------------------------------------------------------------------- You ask what Rotman means, in the 4th line of the proof of Prop. 2.28, p.137, when he says that the equation a^{k-i} = 1 contradicts the original list having no repetitions. In the first sentence of the proof, he chooses k such that the sequence 1, a, a^2, ..., a^{k-1} consists of k distinct elements. "Distinct" means that no two of them are the same, i.e., that the list has no repetitions. On the other hand, the equation a^{k-i} = 1 says that a certain term of that list repeats the first term, 1, contradicting that assumption. ---------------------------------------------------------------------- Regarding the last line of the proof of Proposition 2.28 on p.138, where Rotman writes (a^k)^q a^r=a^r, you ask "Why? (what happens to k and q?)" In the preceding paragraph, he has shown that a^k = a^0 = 1. Hence when you see the "a^k" in the expression (a^k)^q a^r, you should automatically substitute "1" for it. When you do this, the expression clearly simplifies to a^r. ---------------------------------------------------------------------- You ask about the term "special linear group" on p.149. When you encounter a term that you don't recognize, and which Rotman does not define on the page in question, the first thing you should do is check in the index. In particular, the index shows where he has defined "special linear group". ---------------------------------------------------------------------- Regarding the definition of normal subgroup, p.150: > Is there any good way to tell if a subgroup K is a normal subgroup, > other than checking that it contains all the conjugates of its > elements? That might be rather difficult, if G is infinite, for > example. True, but if elements of the group are described by some sort of formula, you can do a general calculation with this formula. Note also Rotman's easy verification that the center of a group is a normal subgroup in the last paragraph of p.151. The commonest way of verifying that a subgroup is normal is to show that it is the kernel of a homomorphism. As for showing that a subgroup is not normal, note that it suffices to find one element of the subgroup and one conjugate of it that isn't in the subgroup. ---------------------------------------------------------------------- You ask about the verification that Z(G) is a normal subgroup of G in the last paragraph of p.151. The wording of the first sentence of that paragraph is a little ambiguous, and I can see that Rotman's use of the semicolon led you to think he intended the second part of the sentence as the explanation of the first part. But that is not what he meant. The first part of the sentence really means "It is easy to check that Z(G) is a subgroup of G", and Rotman intends you to do that check yourself; to verify from the preceding definition of Z(G) that it satisfies the conditions for being a subgroup. Then, in the second half of the sentence, he shows why that subgroup is normal. 
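Since several of the questions above concern checking that a subset of a group is a subgroup, and that a subgroup is normal, here is a minimal brute-force sketch in Python (my own illustration; nothing like it is in Rotman). It represents the elements of S_3 as tuples (alpha(1), alpha(2), alpha(3)), computes the center, and then verifies the subgroup conditions and closure under conjugation directly.

    from itertools import permutations

    G = list(permutations((1, 2, 3)))                    # the six elements of S_3
    def mult(a, b):                                      # composite: (a*b)(i) = a(b(i))
        return tuple(a[b[i - 1] - 1] for i in (1, 2, 3))
    def inv(a):                                          # inverse permutation
        return tuple(a.index(i) + 1 for i in (1, 2, 3))

    e = (1, 2, 3)                                        # the identity
    Z = [z for z in G if all(mult(z, g) == mult(g, z) for g in G)]
    print(Z)                                             # [(1, 2, 3)]: the center of S_3 is trivial
    assert e in Z                                                        # contains the identity
    assert all(mult(a, b) in Z and inv(a) in Z for a in Z for b in Z)    # closed under products and inverses
    assert all(mult(mult(g, z), inv(g)) in Z for g in G for z in Z)      # every conjugate of an element of Z is in Z: normal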
---------------------------------------------------------------------- You ask about the step "a + km = b + (l+k)m" in the 4th line of the proof of Prop.2.41, p.157. Since the step involves the symbol "l" (the letter "ell") suddenly appearing in the equation, you should ask yourself "Has Rotman defined l? Has it appeared before?" If you look at what immediately precedes, you will see that it is introduced on the line just above, where he says "a - b = lm for some integer l". So you should ask yourself "Can I use that equation a - b = lm to turn a + km into b + (l+k)m ?" Hopefully, you will see that you can: The equation a - b = lm can be written a = b + lm, which implies a + km = b + lm + km, which simplifies to b + (l+k) m. ---------------------------------------------------------------------- You ask about Rotman's argument in the second paragraph of the proof of Theorem 2.48, p.162. That argument works, but it is messy. A simpler argument is to note that if a number is congruent to -1 mod m, it is relatively prime to m, but if m is composite, it has a nontrivial divisor a < m, which will be one of the factors in (m-1)!, so (m-1)! will not be relatively prime to m. ---------------------------------------------------------------------- You ask about the First Isomorphism Theorem (p.166) in the case of a linear transformation of vector spaces. If we just consider the First Isomorphism Theorem as stated and proved in this chapter, for groups, it can be applied to a linear map T: V --> W of vector spaces, because such a linear map will (in particular) be a group homomorphism; but it won't tell us that the null space is a subspace; only that it is a subgroup. However, there are several versions of the First Isomorphism Theorem (check out "first isomorphism" in the index) and the one for vector spaces does tell you that the null space is a subspace. You ask what the quotient looks like in the case of linear transformations. Well, consider the case where V = R^3 and W = R^2, and T is defined by T(x,y,z) = (x,y). Then the null space is the z-axis, the cosets of the null space are all the lines parallel to the z-axis, and one can add such cosets by just adding their x- and y- coordinates, getting another line parallel to the z-axis, or multiply a coset by a scalar by just multiplying its x- and y- coordinates by that scalar. In this way, the set of all lines parallel to the z-axis becomes a vector space, isomorphic to the (x,y)-plane, R^2. This is not an ideal example, because one is tempted to think of R^2 as a subspace of R^3 (identifying (x,y) with (x,y,0)), though when one has a linear map T: V --> W there is not generally a natural way to identify the image of T with a subspace of V. But I wanted to keep it simple. Hope it helps. ---------------------------------------------------------------------- You ask about the formula "f^-1 (x) = |H \intersect K|" in the second line of the proof of Proposition 2.54, p.168. That formula should be "|f^-1 (x)| = |H \intersect K|"; i.e., the author means that the inverse image of each element x -- which is shorthand for f^-1 ({x}), the inverse image of the 1-element set {x} -- has the same number of elements as the subgroup H \intersect K. This is, in fact, what he proves in the second paragraph of the proof: That the pairs of elements (h', k') (h'\in H, k'\in K) with h'k' = x (i.e., the members of f^-1 ({x})) can be put in one-to-one correspondence with the elements d\in H \intersect K; by writing each such pair as (hd, d^-1 k).
This is what shows that the two sets have the same number of elements. ---------------------------------------------------------------------- You ask whether in the Second Isomorphism Theorem, p.168, HK/H can be described as H(K/H). No. H(K/H) is meaningless because K/H is meaningless. We have only defined a quotient of a group by a normal _subgroup_ of that group, and our hypotheses don't say that H is contained in K, so it is not a subgroup of K. The point of forming HK is to "fatten K up" so that it will have H as a subgroup. Then we can form the quotient of it by that subgroup. ---------------------------------------------------------------------- You ask about the diagram on p.171, illustrating the Correspondence Theorem. On the left it shows the group G, the normal subgroup K, and two intermediate subgroups, S and T, as in the next-to-last full line of the preceding page. On the right are shown the images of these groups under the homomorphism pi: G --> G/K. (Note that pi takes all elements of K to the identity element of G/K, hence the image of K under pi is just {1}.) The diagonal arrows show the map pi taking groups on the left to their images on the right. Those arrows are shown as going downward because those images are generally "smaller" than the original group. (If G is finite, then for each S, the order of pi(S) is [S:K] = |S|/|K|.) ---------------------------------------------------------------------- You ask why Rotman begins the proof of Proposition 2.61, p.173, by showing that for g = hk, the h and k are unique. This is so that the function phi(g) = (h,k) introduced at the bottom of the page will be well-defined. (You refer to that as "the Euler function", but it is not. When one works with the Euler function one generally writes it phi, but not everything called phi is the Euler function; phi is simply a letter of the Greek alphabet, that can be used for any purpose. When the symbol phi is not given a different meaning, and when the context is such that the interpretation "number of integers _< n that are relatively prime to n" makes sense, you can feel safe in assuming that phi stands for the Euler function. But if this is not the case, it just stands for whatever an author wants to use it for. Note similarly that the pi of Proposition 2.60 does not mean the number 3.14159... .) ---------------------------------------------------------------------- You ask about the translation functions tau_a: G --> G defined by tau_a (x) = ax in the proof of Theorem 2.66, p.178, and whether this is the group operation. The "ax" in that definition does mean the group operation. But the important thing to understand about this definition is that while the group operation is a function of two variables, G x G --> G, this definition has re-encoded the information in that function as a family of functions of one variable. Each of those functions is a permutation of the set of elements of G. (The group operation itself, being a function of two variables, is not a permutation.) Then the definition of psi re-encodes this information once more. Where we had a family of elements of S_G, we now get a single function taking G to S_G. ---------------------------------------------------------------------- You ask about the step tau_a (bx) = a(bx) in the third line of the proof of Theorem 2.66, p.178. This is an application of the definition, tau_a (x) = ax, in the first line of the proof. 
That definition is applicable for all elements a and x of G; at this step we are applying it with the element "bx" in the role of the "x". You also ask "Does tau_a = a?" No! The element a is a member of the group; while tau_a is a function from the group to itself. That function acts by multiplying by a; but a function on the group is not the same as a group element. ---------------------------------------------------------------------- You ask about Rotman's statement, on p.180, lines 3-4, that "to isomorphism, there is just one group of order p". First of all, "to isomorphism" is a typo for the longer phrase "up to isomorphism". The statement that "up to isomorphism" there is just one such group means that if we consider groups that are isomorphic to one another as "the same", then there is just one such group. Precisely stated, it means that there is a group of order p which all other groups of order p are isomorphic to. Similarly, Proposition 2.68 shows that up to isomorphism there are exactly two groups of order 4. In other words, there are two groups of order 4 which are not isomorphic to each other, but such that every group of order 4 is isomorphic to one or the other of them. (In the same spirit, one can say that "up to congruence mod 10", the equation 2x == 8 mod 10 has just two solutions, 4 and 9. In other words, every solution to that congruence is congruent either to 4 or to 9 mod 10.) ---------------------------------------------------------------------- You ask why 4 and 6 are "special cases" in the discussion on p.180. It is because, of all the integers between 1 and 7, those are the only composite numbers. However, among larger numbers, _most_ are composite; so if we were, say, trying to find all groups of order < 100, most of the values would require special consideration, as 4 and 6 do here. The subject of determining the structures of all groups of order n for given n is an enormous one; what Rotman shows you here is just the tip of the iceberg. Another little bit of information is given in Corollary 2.76, p.188; one learns a little more of the subject in Math 250A; still more may be presented in Math 257 (depending on who is teaching it; if someone teaches a version emphasizing infinite groups, the question of groups of finite order n might not come up). Beyond that, there are thousands of pages of published papers. ---------------------------------------------------------------------- You ask, regarding the table on p.182, > Is there a known function for the number of nonisomorphic groups > of a given order, or has the data in Table 2.4 been gathered by > brute force methods? Brute force! I assume a computer was used. Of course, there is some subtlety involved; if one just generated random multiplication tables and checked which were groups and which were isomorphic, it would take absurdly long, even by computer. Theorems about the structure of p-groups must have been used, so that the computer just had to count ways that certain things could be "put together". But combinatorial questions like this (except for the very simplest ones, such as "how many ways are there to choose r objects from a set of n objects?") rarely have exact-formula solutions. Notice that the orders listed in the table are the powers of 2. Powers of primes are the orders for which the number of groups gets large the fastest. If one had a full table of the numbers of groups of each order from 1 to 512, the values collected in this table would stand out like sore thumbs.
Also note the point Rotman is making: That knowing all groups of a given order is not a reasonable goal. (Probably no one has looked at the multiplication tables of all ten million plus groups of order 512!) Rather, one wants to prove general results about groups. ---------------------------------------------------------------------- You comment that in that table on p.182, "the number of distinct groups of each order increases dramatically. But it doesn't seem to be a patterned growth." It is, of course, the "dramatically" part that has led Rotman to include it. But I had been wondering about the pattern, and your question led me to look at it more closely. Based on that chart, it seems that the number of groups of order 2^n is approximately 2^{2^{n/2}}. Or to put it another way, to get an approximation of a value on the right-hand side of that chart, take the square root of the value on the left, and raise 2 to that power. E.g., the first entry is 16, whose square root is 4, and 2^4 is close to the number in the right column; the second entry is 32, whose square root is between 5 and 6, and the entry on the right is between 2^5 and 2^6. ---------------------------------------------------------------------- You ask, in connection with the action of a group on itself by left translation (p.182), whether a group can act on itself by right translation. Yes and no. If we defined alpha_g (a) = ag, this would fail to satisfy one of the conditions for a group action: alpha_g alpha_h would not come out to alpha_gh, but to alpha_hg. But this could be "cured" by defining alpha_g (a) = a g^-1. In fact, note that the action of G on itself by conjugation can be thought of as the combination of its action on itself by left translation, and its action by the above modified version of right translation. ---------------------------------------------------------------------- You both asked about Figure 2.13 on p.184. I'm not sure whether I remembered to point this out in class, but you are right, the last square on the top row is mis-labeled: "2" and "3" are interchanged. ---------------------------------------------------------------------- You ask whether the statement at the bottom of p.184 means that x^G = O(x). It means that for _this_ action of G on itself, x^G = O(x). But "O(x)" is a concept applicable to any action of G on any set, while "x^G" is a notation specific to the action of G on itself by conjugation. ---------------------------------------------------------------------- You ask about the concept of two elements x and y being in the same orbit, and whether this means "x, y \in O(x_i) for some x_i". It can be expressed that way, but that is more complicated than necessary. The key fact is the _first_ sentence of Proposition 2.70 (p.185): "If G acts on a set X, then X is the disjoint union of its orbits". This shows that if two orbits have anything in common, then they are the same. Now the orbit O(x) contains the element x, hence any orbit containing x must be the same as O(x); so if x and y are both in some orbit, that orbit must be O(x); so the condition "x and y are in the same orbit" can be expressed as y \in O(x), in other words, y = sigma x for some sigma \in G. ---------------------------------------------------------------------- You ask about the need for finiteness in Proposition 2.70, p.185. Finiteness is assumed so that the terms in the equation will be numbers. 
If one is familiar with the arithmetic of infinite cardinals, and interprets the equation in terms of cardinals, then one doesn't have to assume finiteness. (However, the main use of the equation is in arguments involving divisibility etc., so the generalization to infinite cardinals is not that helpful.) ---------------------------------------------------------------------- You ask how Corollary 2.72 (p.186) follows from Lagrange's Theorem. Actually, it follows from the proof of Lagrange's Theorem. That proof shows that |G| = [G:H] |H|. Lagrange's Theorem is the consequence that |H| divides |G|, but what is used in Corollary 2.72 is the opposite consequence, that [G:H] divides |G|. Thanks for raising the question; I'll point this out to Rotman ... . ---------------------------------------------------------------------- You ask whether what is used in the proof of Theorem 2.74, p.187, is the second form of induction, and if so, whether we should have started by proving two cases. Yes it is the second form, but no, that does not require starting with two cases. That was needed when we were proving things about Fibonacci numbers, because the statement that each Fibonacci number is the sum of the two preceding only applies after the first two; so knowing a result for "all lower" Fibonacci numbers isn't enough until one reaches the third Fibonacci number. But the proof of this Theorem works with precisely the assumption that the result is true for all groups of smaller order. If in doubt, take some number n for which you are not convinced that the proof should work (e.g., n = 2?) and, assuming the statement of the theorem true for all smaller n, go through the reasoning of the proof, and see whether it proves what it claims. (For a group of prime order, you will find that the Sigma-term in the equation in the proof turns out to have no summands, since there are no elements not in the center; so that sum is 0.) ---------------------------------------------------------------------- You ask about the result that if G/Z(G) is cyclic, then G is abelian, i.e., the result of Exercise 2.83 used in the proof of Corollary 2.76, p.188. The argument I gave at the end of class to prove that Corollary without calling on that exercise was basically the same as the argument used to prove the exercise. Here it is in general: Suppose G/Z(G) is cyclic, generated by the coset of x\in G. Then G is the union of the cosets x^i Z(G). To show G commutative, it suffices to show that the centralizer of each element x^i c (c\in Z(G)) is all of G. Now each element of Z(G) is in that centralizer, because it commutes with everything in G, and x is also in the centralizer, since x (x^i c) = x^{i+1} c = x^i x c = (x^i c) x. Hence since the centralizer is a subgroup, it contains everything of the form x^j c' (c'\in Z(G)), i.e., all elements of G. ---------------------------------------------------------------------- You ask how, in the proof of Prop.2.77, p.189, we know that Z(G) is normal in G. Rotman showed this in the last paragraph of p.151. It also follows from an observation I have emphasized in class: that if two elements x and y commute, then conjugation by x doesn't change y. Since by definition, elements of Z(G) commute with all elements of G, conjugating an element of Z(G) by any element of G leaves it unchanged, hence leaves it in Z(G), showing that Z(G) is normal. ---------------------------------------------------------------------- You ask about the meaning of "S*" in the second line of p.190.
What Rotman is saying there is that the inductive assumption gives a subgroup of G/Z(G) of order p^{k-z}. That part, hopefully, is straightforward. He gives it the peculiar name "S*" to fit the notation used in the Correspondence Theorem, which he applies next; if you look up that theorem in the index and turn back to it, you will see how the "*" is used there. ---------------------------------------------------------------------- You ask about the claim in Exercise 2.101 (p.194) that the n=5 case was done in Lemma 2.80. You're right that this isn't correct as stated. My best guess is that Rotman meant "Lemmas 2.79 and 2.80". I'll mention this in my letter to him. ---------------------------------------------------------------------- You ask what Rotman means when he says, on p.204, that division by 0 is "forbidden". He is not stating this as a new idea; he is taking it as a general fact that everyone knows, but doesn't necessarily understand the reason for -- "You can't divide by zero!" -- and pointing out that the reason is that multiplication by zero is not injective, so it can't have an inverse. Precisely when one can divide a number (or more generally, a ring-element) by another depends on a lot of things; e.g., 3 can be divided by 2 in Q but not in Z. In some rings there are elements x other than zero such that multiplication by x is not one-to-one; for instance, in Z_10, multiplication by 2 isn't, since 2 x 5 = 0 = 2 x 0. But in any ring, multiplication by 0 is the opposite of one-to-one: it is "everything-to-one"; so division by 0 is always impossible. ---------------------------------------------------------------------- You ask whether a commutative ring (p.205) can have an element which acts like an identity for both addition and multiplication. If i is such an element, then just using the fact that i acts like an identity for addition, one can use the proof Rotman gives for "0 . a = 0" to conclude that i . a = i for all a\in R. But the fact that i acts as an identity for multiplication shows that i . a = a for all a in R. Putting those equations together, we get i = a for all a\in R; in other words, i is the only element of R. In this situation, R will not satisfy Rotman's definition of a commutative ring, because his condition 1 notequal 0 implies that there are at least two elements, while the R described above has just one element. However, many mathematicians (including myself) prefer to define "ring" so as to allow this case. Thus, we do not have the condition 1 notequal 0 in our definition, and we call the 1-element ring "the trivial ring". ---------------------------------------------------------------------- You ask, > ... How is it decided which of these rules should be axioms? I mentioned in class the example of the definition of "domain" (p.208) and the following Proposition 3.5. Since the proposition shows that a commutative ring is a domain in Rotman's sense (i.e., satisfies the cancellation law for multiplication) if and only if the product of every pair of nonzero elements is nonzero, it follows that the concepts "a ring satisfying the cancellation law" and "a ring in which every product of nonzero elements is nonzero" are equivalent; so either might be taken as the definition of "domain". And in fact, most authors use the latter, while Rotman uses the former. The reasons for choosing one or another condition are various: pedagogical, esthetic, etc..
I think that Rotman makes his choice for a pedagogic reason: students are aware that they often use the cancellation property in their calculations, but not that they use the fact "a product of nonzero elements is nonzero", though it is really equivalent. So the concept as he defines it is more likely to strike the student as a natural one than the other. On the other hand, most mathematicians use the other definition because it is logically simpler -- it involves just two elements instead of three, it involves them in a symmetric way; and the fewer elements are involved, the easier it is to check whether something has a stated property. Another sort of consideration in deciding what set of conditions to take as axioms (i.e., to include in a definition) is the relation to other concepts. For instance, though one might have taken a(bc) = b(ca) as one of the conditions in defining a commutative ring, and with the help of condition (iv) (existence of identity element) deduced both commutativity and associativity of multiplication from this, the present system of definitions makes apparent the relation to noncommutative rings (same definition without the commutative law), to nonunital rings (same definition without existence of identity element), and also to other mathematical objects which involve associative operations. The above discussion concerns different systems of axioms which are equivalent, and so define the same class of objects. Another sort of question one could ask is how one chooses among _different_ classes of objects (defined by systems of axioms that are _not_ equivalent), and decides which are worthy of study. But I won't go into that now. ---------------------------------------------------------------------- You ask about the statement on p.209, after the last display, that since S is an additive subgroup of R, S is an abelian group, asking whether Rotman is saying that every additive group is abelian. The convention is that a group is not _written_ additively unless it is abelian. However, that alone does not prove S is abelian; it just shows that if it weren't, it would not be appropriate to write it additively. The fact Rotman is using is that any subgroup of an abelian group is abelian. ---------------------------------------------------------------------- You ask whether the units (p.213) of the ring of real numbers are just 1 and -1. Well, try another real number, such as 2. Does 2|1 ? To answer that, you have to remember that the definition of "a|b" is that there is an element c _of_the_ring_ such that b = ac. So is there a real number c such that 1 = 2 c? I think you will see that the answer is yes -- c = 1/2. So 2 is a unit of R. So 1 and -1 are not the only units of R. ---------------------------------------------------------------------- You ask how, in point (iv) on p.218, Rotman goes from the equation x = g^-1 y to y == x. (I am using "==" for the equivalence-relation symbol.) Look at the first display in point (iv), which defines what it means for two elements to be related by ==. We want to show that y == x; i.e., we want to apply it with y in the role of the "x" of that display, and x in the role of the "y". So we need to find some element h\in G such that x = hy. (I am writing "h" instead of "g" because in our situation, "g" already refers to a specific element.) By the equation Rotman had just proved, the choice h = g^-1 will do. 
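(Since several of the recent questions have circled around orbits and this kind of "two-way" argument, here is a tiny computational illustration of my own, not anything in Rotman: the cyclic group generated by the 3-cycle (0 1 2), acting on the four-element set {0, 1, 2, 3}. It checks that the relation "y = sigma x for some sigma in G" is symmetric, and prints the orbits, which partition the set as in Proposition 2.70.)

  # The group G = {e, (0 1 2), (0 2 1)} acting on X = {0, 1, 2, 3}.
  # Each permutation is stored as the tuple of images of 0, 1, 2, 3.
  G = [(0, 1, 2, 3), (1, 2, 0, 3), (2, 0, 1, 3)]
  X = range(4)

  def related(x, y):    # "x == y" in the sense of the display: y = sigma(x) for some sigma in G
      return any(sigma[x] == y for sigma in G)

  # Symmetry: whenever x == y, we also have y == x.
  print(all(related(y, x) for x in X for y in X if related(x, y)))    # True

  # The orbits, which partition X (here {0, 1, 2} and {3}).
  print(sorted(set(tuple(sorted(y for y in X if related(x, y))) for x in X)))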
---------------------------------------------------------------------- You ask about Rotman's discussion of the equivalence relation of case (iii) of the example on p.219, and whether one of his "if"s should be "if and only if". He is correct in saying, in the third and fifth lines, "If x == a" and "if x = ah" respectively. In each of those sentences, he is proving one direction of the double implication; together they prove the whole result. He could be criticized for saying, on the second line of that point, "... given by a == b if a^-1 b\in H", rather than "... if and only if ...". Mathematicians tend, generally, to use "if" in definitions when they mean "if and only if", for various reasons. (One reason is that a sentence "We shall say X if and only if Y" sounds as though when Y holds, we will be forced to say X even if it is irrelevant.) But since "given by" is a somewhat abrupt way of making a definition, I would have liked him to compensate by being more precise about the implication. ---------------------------------------------------------------------- You ask about the term "pairwise disjoint" in the definition on p.220. A family of sets is called "pairwise disjoint" if each pair of its members is disjoint. As you guessed, this is related to the term "disjoint union" -- a set is called a "disjoint union" of a family of subsets if it is their union, and they are pairwise disjoint. (Similarly, a family of integers is called "pairwise relatively prime" if each pair of them is relatively prime; etc.) ---------------------------------------------------------------------- You ask how, on p.221, two lines before Example 3.6, Rotman concludes that x\in A_i \intersect A_j. He has assumed two lines earlier that x\in A_i, and on this line that x\in A_j. Since it belongs to both these sets, it belongs to their intersection. ---------------------------------------------------------------------- You ask about Rotman's reference to "Addition F x F --> F" at the beginning of the last full paragraph of p.222, and "multiplication F x F --> F" at the end of that paragraph. He is writing operations on the set F as functions on the product set. This idea was introduced on p.115, in the definition near the bottom of the page. If, after reviewing that definition and the discussion following it, you still have questions, ask again! (By the way, "F x F" is not called the "cross product" of F with itself. "Cross product" is a term only used in studying vectors in R^3. It is called the "Cartesian product", or the "product set".) ---------------------------------------------------------------------- You ask why Rotman uses the same symbols, sigma and tau, for permutations and (as on p.226) for sequences. There are only a limited number of letters, even combining the Greek and Latin alphabets, so a reader of mathematics has to note what meaning the author is giving to what letter in each context. An author should likewise try not to use the same letter for different things in the same context; but in different contexts he or she is free to re-use symbols. ---------------------------------------------------------------------- You ask whether, when one forms the set of polynomials R[x] over a ring R (p.226), R must be a ring that has only "constants". Yes and no -- mainly no.
Any ring R can be used; in particular, as Rotman mentions in the last paragraph of the section, one can start with a commutative ring R, let A = R[x], and then take polynomials over A; though in this case, to avoid confusion, one calls the new indeterminate "y" rather than "x", and writes the ring A[y] = R[x,y]. This shows that we can use an A that consists of polynomials, not "constants". However, the partial "yes" is that whenever we form a polynomial ring, R[x], A[y], etc., we may regard the elements of the original ring (R, A, etc.) as "constants" in the sense that they don't involve the new indeterminate (x in the first case, y in the second). So in R[x,y], an element like x^2+1 is "constant" relative to y. ---------------------------------------------------------------------- You note, regarding p.227, 3rd line of the proof of Proposition 3.17, > at first i wondered why g(r) could be distributed among the elements > of f(r). the book had not yet written that R[x] is a commutative ring. I hope that what I said in class cleared up this point -- Proposition 3.17 is not about R[x], but about R, which we know from the start is a commutative ring. It is simply there to explain the sentence at the bottom of the preceding page, by showing that the indicated multiplication formula holds in any commutative ring, hence must hold in R[x] _if_ we are to make it a commutative ring in which the i-th term of each sequence represents the coefficient of x^i. ---------------------------------------------------------------------- You ask for an example of nonzero polynomials whose product is 0. Remember that by part (ii) of Lemma 3.18, p.227, this can happen only when R is not a domain; i.e., when there are nonzero elements r and s in R such that rs = 0. In that case, the easiest way to get an example of the sort you ask for is by taking such a pair of elements r and s in R and regarding them as constant polynomials. You can see that in R[x], they still satisfy rs = 0. If you want an example where the factors are not constant, you can take something like (r X + r)(s X^2 - 2s X + s). (The above examples have the property that every coefficient in the first factor has product zero with every coefficient in the second factor. One can even come up with examples that don't have this property. If R is a commutative ring with two elements r and s such that r^2 = 0 and s^2 = 0, but rs is nonzero, then we have (rX + s)(rX - s) = 0, even though the first coefficient of the first polynomial and the second coefficient in the second have nonzero product. We haven't yet seen any examples of rings R with such a pair of elements r and s, but they do exist.) ---------------------------------------------------------------------- You ask why Rotman introduces the construction of polynomials as sequences, only to return to the familiar expression of polynomials as elements of the form "s_0 + ... + s_n X^n" in Proposition 3.21 (p.229). Well, what definition of polynomial would you suggest?
As I emphasized in lecture, the "High School and calculus" concept of a polynomial as a certain kind of function is inadequate, on the one hand because it doesn't make clear in what way a function on one field, say R, should be extended to a function on another field, say C, when we say something like "The polynomial x^2 + 1, which has no roots over the reals, does have roots in the complex numbers"; and that approach is particularly inadequate in the case of polynomials over finite fields, since, for instance, over Z_p we know that the different polynomials x^p and x give equal functions. On the other hand, if one wants to define a polynomial as a "symbol" of the form s_0 + ... + s_n X^n, one has to be very careful how one specifies the allowed set of symbols. E.g., are x+1 and 1+x both allowed, and if so are they to be considered different symbols? Rotman's sequence-of-coefficients approach captures the essential information of what distinguishes one polynomial from another under the above "symbol" approach, namely the coefficients of each power of x, and gives us one unambiguous mathematical entity corresponding to each polynomial. Do you have a better suggestion as to how Rotman could have cured the impreciseness in the concepts of polynomial we had before this course? Finally, in Proposition 3.21, he shows us that using the ring operations and a certain choice of sequence x, we can go back to representing polynomials in our old familiar notation -- but now with a precise definition of what this notation means. ---------------------------------------------------------------------- You ask, regarding Rotman's comment about returning to the "usual role" of x as a variable, at the bottom of p.230, whether x has to be an element of the ring R. Well, the x in f(x) is, of course, not an element of R. But I guess you meant the a that Rotman is substituting for x to get f(a). He is only considering the case a\in R in that paragraph, but as you suggest, one can use elements a from other commutative rings containing R; e.g., from a larger field if R is a field. That is a case we will be leading up to soon! ---------------------------------------------------------------------- You ask why the field Z_p (x) is infinite (p.231, Prop.2.34). First, the ring Z_p [x] is infinite because it contains (among other things) the infinitely many monomials 1, x, x^2, x^3, ... . The field Z_p (x) contains the ring Z_p [x] (or more precisely, an isomorphic image of that ring), so it is also infinite. That is the sort of point where I would hope students would look at the object in question, ask themselves "Does the object seem finite or infinite to me? What kinds of elements does it consist of? How many of them are there?", and come up with the answer in that way themselves. I don't want to discourage you from asking questions of the day, but I do want to encourage you to struggle with such questions yourself a bit first. ---------------------------------------------------------------------- You ask in connection with Rotman's use of the term "subfield" in Proposition 2.34, p.231, whether a subfield of a field F is the same as a subring of F; that is, whether every subring R of F inherits the property that all nonzero elements are units. No; because being a unit in R means having an inverse in R, so even though the nonzero elements of R have inverses in F, if those inverses don't belong to R, then R won't be a field.
In particular, if we take any integral domain R (such as Z) which is not a field, and let F be its field of fractions, then R will be a subring of F, but not a subfield. ---------------------------------------------------------------------- You ask whether the statement that Q is the field of fractions of Z together with the comment on fields of fractions in Example 3.7, p.233, implies that Z is not actually contained in Q, but just isomorphic to a subring of Q. A good question, but a complicated one to answer. In general, mathematicians are not concerned with precisely what set-theoretic tricks are used to construct a mathematical object (once they are satisfied that such a construction can be achieved), but with what properties that object has, especially when the properties determine the object uniquely up to isomorphism. So, for instance, the field of fractions F of a domain R is determined up to isomorphism by the properties that it is a field given with an injective homomorphism of R into it, such that every element of F is a quotient of two members of the image of that homomorphism. A consequence is that if two mathematicians prefer different ways of constructing the field of fractions, they can still talk about its algebraic properties -- without foulup -- any algebraic statement about F that is true under one mathematician's definition will be true of the isomorphic object that the other has in mind, so the fact that they constructed it differently makes no difference. So one mathematician may do what I mentioned in class was possible: First construct Q as a set of equivalence classes of ordered pairs, then "cut" the images of the elements of Z out of Q and "splice in" the elements of Z itself in their place, getting a field which literally contains Z; another may say "Having constructed Q, we will from now on use the name "Z" for the subring of Q isomorphic to our original Z", so that again, Z is literally a subring of Q. A third may say "Having constructed Q, we now have two copies of Z, our original one, and the isomorphic one lying in Q; but since they are isomorphic, we will not worry about the difference between them." In Math 104 you probably saw the real numbers constructed from the rationals, resulting in two copies of Q, the original one and the isomorphic copy of Q inside the reals; moreover, there are two different common ways to construct the reals from the rationals (by Dedekind cuts and by equivalence classes of Cauchy sequences), so we have several versions of the reals, each with a copy of Q in it. And when the complex numbers are constructed from the reals (which can also be done in more than one way), one has, in addition to the reals that one constructed earlier, the copy of the reals inside the complex numbers -- with a copy of the rationals inside that, and a copy of the integers inside that. So, ultimately, one has to come to terms with the fact that when we speak of something like "the integers" or "the real numbers", we are not so much referring to a specific set constructed in a specific way, but to any object with a certain set of properties; and that we speak of this as though it were unique out of convenience -- but that this is OK, because it is the properties that we use, not the specific construction. (This is not to say that the question of how such an object can be constructed is unimportant, or that we should forget it once we see it done.
The construction provides a proof that such an object exists, and having seen it done, we are in a position to devise similar constructions when we want to construct objects with other properties.) ---------------------------------------------------------------------- You ask what the difference is between an ideal (p.235) and a vector space. You can answer that by looking at the two definitions, and for each condition in one definition, asking whether it is true in the other. The most important two differences are the following: (a) In the definition of a vector space, the things that one multiplies by come from a field, while in the definition of an ideal, they come from a commutative ring, which may not be a field, and (b) in the definition of an ideal I, this I is a subset of the ring R, while in the definition of a vector space V over a field F, V is not, in general, a subset of F. Note that you would not find the above differences by going through points "(i), (ii) and (iii)" of the definition of an ideal. When you read a definition, you need to start at the beginning, in this case, with the words "An ideal in a commutative ring R is a subset I of R such that"! ---------------------------------------------------------------------- You ask how, on p.240, two lines before the first display, Rotman deduces, from the fact that s_n is nonzero, that it is a unit. This is because s_n belongs to k, which is a field. Check the definition of "field" if this does not make the fact clear. (The division algorithm does not work in rings A[x] where A is not a field; e.g., when A is the ring of integers.) ---------------------------------------------------------------------- Regarding the statement on p.240, above the long middle display, that h = 0 or deg(h) < ..., you write: > I see that the set of all linear combinations of f(x) and g(x) is an > ideal in k[x], but I am not sure how it follows from the existence of > gcd's in Z (the integers). That's not what he means. Note the word "as" after the semicolon in the first line. He is saying the situation is analogous to the one in Z, which also corresponds to considering an ideal (in that case, an ideal in Z). I agree that he could have been more clear, though. ---------------------------------------------------------------------- You ask what Rotman means on p.251, end of Example 3.16, when he says that x+1 is "2x+2 made monic". The steps of the Euclidean algorithm that he has shown give an element, 2x+2, which is a linear combination of f and g, and which divides both f and g. To be the gcd of f and g, it requires just one more property: that it be monic. This property can be achieved by multiplying it by the inverse of its leading coefficient, 2. (The two properties mentioned above are not lost when this is done.) So multiplying 2x+2 by the inverse of 2, we "make it monic" and get the result x+1. ---------------------------------------------------------------------- You ask how, at the end of the middle paragraph on p.254, the inequality curly-d ( beta(u+vi) ) < curly-d (beta) shows that the second property for a degree function is satisfied. In the computations Rotman has been making, beta(u+vi) has the role of the "r" in the second condition defining a degree function. Showing that curly-d of this complex number is < curly-d (beta) shows that that complex number is either 0, or is a nonzero element on which curly-d has smaller value than it has on beta (which corresponds to the g of that definition). This verifies the last line on p.252.
You should check that the computations which have preceded give the conditions mentioned on the preceding lines. ---------------------------------------------------------------------- You ask whether it is not wrong for Rotman to say on p.257, top line, that alpha belongs to the set of linear combinations I = {sigma alpha + tau beta : sigma, tau \in Z[i]}. Namely you suggest that in setting sigma = 1 and tau = 0, one is dropping the beta term, and leaving a result that is no longer a linear combination. First of all, the sense of a mathematical term is determined by the definition one gives of it, and the definition of a "linear combination" of two elements of a ring R, given on p.212, in part (ii) of Lemma 3.8, allows any coefficients in R, not excluding 0. So based on the definition, 1 alpha + 0 beta is indeed a linear combination of alpha and beta. One can, of course, ask whether it was appropriate to make that definition. There are two versions of this question: Is the concept defined on p.212 a more or a less natural mathematical concept than the one one would get by excluding the coefficient 0, and is it reasonable to use the English words "linear combination" for the set so defined? In most mathematical contexts, unless there are special reasons for excluding degenerate cases, the concepts one gets by not excluding them turn out to be more natural than those one gets by excluding them. For the case of "linear combination", this is most easily seen in the context of linear algebra: If x and y are vectors, then the set of all linear combinations of them with coefficients in the base field is a subspace, while if one excludes combinations where one or the other coefficient is 0, one generally gets something that is a vector space minus two lines, a much less basic mathematical object. As for reasonable use of English words, most English words, as they are ordinarily used, exclude degenerate cases. For instance, the everyday meaning of a "set" of things excludes the empty set and sets with just one element. Even when there are exactly two elements, one generally doesn't call the result a "set" but a "pair", so one could argue that by everyday usage, a "set" should have at least three elements. To avoid such problems, we could make up words for mathematical concepts using random combinations of sounds; but it is considered preferable to use English words which approximate the general meaning of a mathematical concept, even if special cases have to be treated differently. ---------------------------------------------------------------------- You ask how, in the paragraph before Lemma 3.50 on p.257, one can say that if n is an odd number, then it is congruent to either 1 or 3 mod 4. Any integer is congruent to either 0, 1, 2 or 3 mod 4. (See Corollary 1.47, p.64). This means it is of one of the forms 4m, 4m+1, 4m+2, 4m+3 (m\in Z). If it has one of the forms 4m or 4m+2, it is even (since 4m = 2(2m) and 4m+2 = 2(2m+1)). So to be odd, it must have one of the remaining forms, 4m+1 or 4m+3; i.e., it must be congruent to 1 or 3 mod 4. ---------------------------------------------------------------------- You ask where x^2 - 1 comes from in the 5th-from-last line on p.257. Rotman has just been talking about the set of elements of Z_p that satisfy a^2 = 1. This equation is equivalent to a^2 - 1 = 0, which is equivalent to saying that a is a root of x^2 - 1. 
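(A quick illustration of my own, not from the text: one can ask Python to list this set of roots for a sample prime, and, for contrast, for a modulus that is not prime, where Z_n is not a field and x^2 - 1 can have more than two roots.)

  # Elements a of Z_n with a^2 = 1, i.e., the roots of x^2 - 1 in Z_n.
  for n in (13, 8):
      print(n, [a for a in range(n) if (a * a) % n == 1])
  # For the prime 13 this prints [1, 12], i.e., just 1 and -1;
  # for 8, which is not prime, it prints [1, 3, 5, 7] -- more roots than the degree.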
(The "magic" of this proof is that it starts by looking at a^2 = 1 as an equation in the group (Z_p)^x and applying group theory to discuss the number of elements a satisfying that equation; then switches to looking at it in ring-theoretic terms, and applies the result on the number of roots of a polynomial to it.) ---------------------------------------------------------------------- You ask, regarding the first full sentence after the second display on p.258, how one deduces that a^2 + b^2 is not congruent to 3 mod 4. Note that in that display, Rotman computed the squares of 0, 1, 2, 3 mod 4. Since any integer is congruent to one of those four values mod 4, every integer has square congruent to one of the results obtained there, namely 0 or 1. If you add 0 or 1 to 0 or 1, you can get 0, 1 or 2, but not 3; so the sum of two squares can't be congruent to 3 mod 4. ---------------------------------------------------------------------- You ask why, three lines after the 4th display on p.258, the condition p | (m+i) - (m-i) = 2i gives a contradiction. If a Gaussian integer a+bi is divisible by an integer r \in Z, that means that a+bi = (c+di)r for some integers c and d, from which we see that r divides both a and b in Z. In particular, if p | 2i in Z[i], then p | 2 in Z. At the point you ask about, we are in the case where p is an odd prime number, so it is not a divisor of 2; so an argument that says that it is gives a contradiction. ---------------------------------------------------------------------- You ask why Rotman talks about Z[zeta_p] and Fermat's Last Theorem in the same paragraph. He shows the reason on the middle of p.263. If a and b are integers, and we abbreviate zeta_p to zeta, then the integer a^p + b^p can be factored in Z[zeta] as (a + b) (a + zeta b) (a + zeta^2 b) ... (a + zeta^{p-1} b). (Cf. Exercise 3.74, which, by the way, will be in next week's homework.) So if we had a^p + b^p = c^p (the sort of equation that Fermat's Last Theorem says can't occur), this would give two factorizations of the same element: the one shown above, and the factorization c . ... . c (p factors). Now if Z[zeta] were a unique factorization domain, we could say that the irreducible factors of the terms appearing on the two sides of that equation were equal, and by arguments working from there, we could get a contradiction. Anyway, I hope you realize that this material is not in your reading assignment, as noted on the homework sheet with the homework due tomorrow. ---------------------------------------------------------------------- You ask why, in the first sentence of the Remark starting at the bottom of p.270, all the coefficients of h(x) are zero in Z_p. Remember that an integer is "zero in Z_p" (more precisely: has image equal to 0 under the natural map Z --> Z_p) if and only if it is divisible by p. So the author is saying that all coefficients of h(x) are divisible by p. (By all primes p? No, you can see that the sentence ends with the words "for some prime p.") This is the same fact that he used at the beginning of the proof of Lemma 3.59, so see the second sentence of that proof for the reason this holds (after making sure you understand the definition of primitive, and hence what it means for h(x) not to be primitive). ---------------------------------------------------------------------- You ask about the transition from the first to the second equation in the proof of Corollary 3.62, p.272. Well, you've found another typo in Rotman! 
The "f(c)" should just be "f(x)", or better, "c(f) f*(x)". (I wish you'd e-mailed your question to me before class; then I could have announced the correction. Anyway, I'll send it to Rotman with the rest.) ---------------------------------------------------------------------- You ask whether Gauss's Theorem (p.272) can be applied to any other ring-and-field pair than Z and Q. Yes: If R is any principal ideal domain (which we have seen defined) or more generally, any unique factorization domain (which we will see soon), the corresponding result holds. The proof is really the same, but we don't give it in 113 so as not to overload the students with generality. But we do give the general form in 250A. ---------------------------------------------------------------------- You ask about my argument Monday that f(x) --> f(x+c) was a homomorphism Z[x] --> Z[x] (Rotman's Exercise 3.42, used in the proof of Lemma 3.66, p.276), in which I drew pictures of graphs. As I said in class, that was not a proof. It was just a way of getting intuition about that map by relating it to things one was familiar with. The essence of the proof of that result is to apply Prop.3.17 (p.227), not in the ring R but in the ring R[x], with x+c in the role of the element "r", and the coefficients of f and of g in the roles of the s_i and t_j. Notice that the elements a_k (defined on the line after the display in the statement of that Proposition) are the coefficients of fg. Hence the formula displayed there shows that f(x+c) g(x+c) = (fg)(x+c). A similar, but much shorter computation shows that f(x+c) + g(x+c) = (f + g)(x+c); and the fact that the map under consideration takes 1 to 1 is immediate. ---------------------------------------------------------------------- You ask whether the result that each cyclotomic polynomial Phi_p(x) is irreducible (Corollary 3.68, p.277) can be proved more simply by showing that these polynomials are all irreducible mod 2. Unfortunately, they aren't. For instance, though Phi_7(x) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 is irreducible in Z[x], it factors in Z_2 [x] as (x^3 + x + 1)(x^3 + x^2 + 1). ---------------------------------------------------------------------- After thinking about both your questions on the proof of Proposition 3.76 (p.285), it suddenly struck me where the difficulty comes from: Rotman confuses three different maps under the name "phi" in the statement and proof! Let's understand "phi" to be defined as he does at the start of the proof, the evaluation map k[x] --> K taking x to z\in K. As he shows in that part of the proof, its kernel has the form (p(x)), so by the first isomorphism theorem, its image is isomorphic to k[x]/(p(x)); i.e., phi can be written as a composite k[x] --> k[x]/(p(x)) =~ im phi \subset K. Now the Proposition he quotes shows that k[x]/(p(x)) is a field, so im phi, being isomorphic to that field, is a field, hence is a subfield of K containing phi(x) = z; but since every element of k[x]/(p(x)) has the form g(x)+I, every element of im phi has the form g(z), so every element of im phi lies in every subfield of K containing z, so im phi must be the smallest subfield of K containing z, i.e., k(z). So the isomorphism in the middle of the above display becomes k[x]/(p(x)) =~ k(z), which proves (ii). Rotman seems to be using phi for that isomorphism in statement (ii), for the map I have called phi at the beginning of the proof, and for the map k[x] --> k[x]/(p(x)) at the top of p.285.
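(In case a concrete instance of that isomorphism helps, here is a little numerical sketch of my own for the case k = Q, z = sqrt(2), p(x) = x^2 - 2, done with floating-point arithmetic. It is only an illustration of statement (ii), not part of Rotman's proof.)

  # An element a + b*x of Q[x]/(x^2 - 2) is stored as the pair (a, b);
  # the map phi sends its coset to the real number a + b*sqrt(2).
  import math
  z = math.sqrt(2)

  def mult(u, v):             # multiplication mod x^2 - 2: replace x^2 by 2
      (a, b), (c, d) = u, v
      return (a*c + 2*b*d, a*d + b*c)

  def phi(u):
      a, b = u
      return a + b*z

  u, v = (1, 3), (2, -5)      # the cosets of 1 + 3x and 2 - 5x
  print(phi(mult(u, v)), phi(u) * phi(v))   # the same number (up to rounding):
                                            # phi respects multiplication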
---------------------------------------------------------------------- You ask about the relationship between the unique subgroup of order p in the proof of Theorem 3.78 and the whole field of p^n elements in Corollary 3.79 (p.286). First of all, the "p"s in those two results are different: In the first, p ranges over the prime divisors of the order of the multiplicative subgroup G; in the second, it is the characteristic of the field k. When we have finished proving the Corollary, we know that k has p^n elements, hence its multiplicative group, consisting of all its nonzero elements, has order p^n - 1. If we want to apply the Theorem at that point, what it tells us is that for each prime p' dividing p^n - 1, the group k^x has exactly one subgroup of order p'. ---------------------------------------------------------------------- You ask about Rotman's statement on p.286 that k is an n-dimensional vector space over Z_p. I guess that when you took 110, the only fields considered were the real and complex numbers. But one can do linear algebra over any field. See the definition on p.302. ---------------------------------------------------------------------- You ask about Rotman's computation "p^n q^{n-1} - 1 = -1" on the 5th line on p.287. This is because F is of characteristic p, and multiplying an element of F by p is the same as multiplying it by 1+...+1 (p times) which equals 0 in F. Hence the same holds for multiplying it by any multiple of p. ---------------------------------------------------------------------- You ask whether a p-subgroup (p.398) just means a p-group that is a subgroup of G. Right! ---------------------------------------------------------------------- You ask how, if a Sylow p-subgroup of G means a maximal p-subgroup (p.398), a group can have more than one Sylow p-subgroup. Look at the sentence after that definition, beginning "Maximality means ...", then consider the example G = S_3, and its subgroups <(1 2)>, <(1 3)> and <(2 3)>. Each of these is a Sylow 2-subgroup; it has the property of being maximal among 2-subgroups of G, as defined there. But none of them has the property of containing all 2-subgroups (since they don't contain each other), which I suspect you were thinking "maximal" meant. If there were a 2-subgroup with that property, it would be called the "greatest" 2-subgroup of G, and it would then, as you say, be the only Sylow 2-subgroup of G. The distinction between "maximal" and "greatest" is an important one, not just in this topic of this course, but in mathematics generally, so it is worth absorbing. ---------------------------------------------------------------------- You ask, regarding the concept of Sylow p-subgroup (p.398): >I was wondering, in general, what is useful or important about sylow-p >subgroups. It doesn't seem like maximal subgroups of order power of p >are something you would come across very often. Do they have some >deeper usefulness or importance than just the theorems in Rotman? They are one of the important keys to further study of the structure of finite groups! If, for instance, one wants to classify all groups of order 1000 = 2^3 x 5^3, then knowing that such a group must have subgroups of order 2^3 = 8 and 5^3 = 125 allows one to start by studying groups of those two orders. Then, for each group G of order 1000, one knows that there must be a subgroup of the first sort and one of the second; and moreover that all subgroups of order 8 will be conjugate, hence isomorphic to one another, and the same for all subgroups of order 125.
So in trying to classify such G, we can begin by listing the isomorphism classes of groups of order 8 and the isomorphism classes of groups of order 125. Then, for each pair of possibilities, one can try to figure out how they can be "fitted together" -- in what ways one can build a group of order 1000 whose Sylow 2-subgroups are of the first sort and whose Sylow 5-subgroups are of the second. However, we are not going to be pursuing the structure of finite groups any further in this course. So rather than Sylow subgroups being a doorway into new results, they are the point at which we are leaving the subject. I had hoped that the proof of the Sylow Theorems would give some taste of the beauty of the field. ---------------------------------------------------------------------- You point out that at the top of p.399, Rotman allows {1} as a p-group, while his definition on p.188 requires that the order be a positive power of p, and you ask which definition we will be following in this course. You are right that he is inconsistent. Whether to word definitions so as to include or exclude trivial cases is a kind of question on which mathematicians often differ (e.g., whether the set of natural numbers includes 0). I would prefer to count the trivial group as a p-group. However, we are leaving group theory as of the next reading, and the question will not come up on the midterm, so with respect to this course it is now moot (except for the Final Exam, and I will try to avoid having any questions there whose answer would depend on which definition one uses). ---------------------------------------------------------------------- You ask whether, in the top paragraph of p.399, as one constructs larger and larger p-subgroups of G, one can ever get G itself. Certainly. If G is a p-group, then one must always get G in the end. If it isn't, then one won't. ---------------------------------------------------------------------- You ask how in the proof of Lemma 5.15, p.400, "the conclusion made by Exercise 2.89 gives that S is a p-subgroup of N_G(P), strictly larger than P". That Exercise is only being used to conclude that S is a p-group. Do you see that it gives that conclusion? (The exercise is a trivial application of the formula relating the orders of G, H, and G/H.) The statement that S is a subgroup of N_G(P) comes from the correspondence theorem; that it is strictly larger than P comes from the fact that it contains a. ---------------------------------------------------------------------- You ask how there can be more than one Sylow p-subgroup, for a given prime p (p.401, Theorem 4.16(ii)). There often are more than one; e.g., in S_3, the subgroups <(1 2)>, <(1 3)> and <(2 3)> are the Sylow 2-subgroups. The thing that is most likely the source of your confusion is the word "maximal" in the definition. But if you look at the definition of "maximality" beginning at the bottom of p.398, you will see that each of the above subgroups satisfies it. This sense of the word "maximal" is standard in mathematics. In contrast, an element that actually contains all others of the sort in question is called a "greatest" element. So in S_3, there are three maximal 2-subgroups, but no greatest one. ---------------------------------------------------------------------- You ask about the correspondence theorem for rings (p.438), and why it concerns ideals rather than subrings.
The statements for ideals and for subrings are both true; but the subrings case turns out to be of much less use than the ideals case; so that is the only one Rotman states. ---------------------------------------------------------------------- You ask how the Correspondence Theorem shows that every ideal of Z_m has the form ([a]) for some divisor a of m (p.439, Example 6.1). For me to answer a question like this, you need to tell me how far you were able to get in figuring it out for yourself. The Correspondence Theorem concerns the relationship between ideals of a ring R and ideals of a factor ring R/I. Could you see what the ring R and ideal I to which it must be applied here are? If not, you should have asked what ring and ideal the theorem was being applied to. Hopefully, you could see that since Z_m simply means Z/(m), Rotman must be applying the Correspondence Theorem to the ring Z and ideal (m). It then says that ideals of Z/(m) = Z_m correspond to ideals of Z that contain (m). Could you figure out what these ideals were? If you did, was your problem in seeing what ideals of Z_m these corresponded to ... ? As I say in the course handout, where I describe the "Question of the Day", if you were in my office I could question you and see what needed to be clarified. But that's time-consuming to do by e-mail; so it's your job to pinpoint what your difficulty is, and let me know in your question. ---------------------------------------------------------------------- You ask whether one can think of a prime ideal (p.439) as an ideal generated by an irreducible element. In a PID, the prime ideals are the ideals generated by the irreducible elements, with one exception: {0} is a prime ideal, but 0 does not fit the definition of "irreducible element". But in a ring that is not a PID, this does not work. For instance, in the ring R[sin x, cos x] that I mentioned in lecture last week, sin x is an irreducible element, but the ideal that it generates is not prime, since (1+cos x)(1-cos x) = (sin x)^2, which belongs to the ideal even though neither 1+cos x nor 1-cos x does. On the other hand, the ideal of functions f(x) in this ring that satisfy f(0) = 0 is easy to show to be prime, but it cannot be generated by a single element. (It can be generated by sin x and 1-cos x.) ---------------------------------------------------------------------- You ask whether the deduction made in the sentence beginning at the bottom of p.441 and ending at the top of p.442 is based on a principle that if a ring is isomorphic to a domain, then it is a domain. Definitely! That property should be easy to see. You continue, "If so, what other properties of a ring can we deduce through isomorphism?", for instance, whether this is so of the property of being a PID. Yes; that property, and every property that can be stated entirely in terms of properties of the ring operations. (An example of a property that is not preserved under isomorphism is that of containing the integers as a subring; but the property of containing a subring isomorphic to the integers is preserved under isomorphism.) All of the properties to which we have given names are defined in terms of the ring operations, so all of these are inherited by isomorphic rings. ---------------------------------------------------------------------- You ask about the definition of associates (p.445), and the concept of "unit" that it involves, concluding "does not that mean that any two elements in Z are associates?" I can only guess why you think it means that. 
If you look at the definition of "unit" on p.213, you will see that the only units in Z are +1 and -1. So if n is any integer, its only associates in Z are n and -n. On the other hand, all nonzero integers become units when one works in the field Q, so any two nonzero integers are associates in Q. When one is talking about Z, "units" and "associates" means "units in Z" and "associates in Z"; when one is talking about Q they mean "units in Q" and "associates in Q". Rotman emphasizes this distinction on p.213 in the second paragraph after the definition of "unit". One of the good things about Rotman as a text is that he does point out these distinctions one needs to be aware of. ---------------------------------------------------------------------- Regarding Lemma 6.10 on p.446, you asked for an example of a ring (necessarily not a PID) having an infinite strictly ascending chain of ideals. I guess I didn't make it explicit, but the example I gave in class of a ring with elements that cannot be factored in any way into irreducibles, namely the ring of polynomials over the rationals whose constant terms are integers, has this property. Such a chain is given by the principal ideals generated by the successive factors of x that I pointed out, i.e., (x) < (1/2 x) < (1/4 x) < ... The inclusions are strict because 1/2 x does not belong to (x), etc. (since there is no a in the ring such that a x = 1/2 x.) ---------------------------------------------------------------------- You ask about the field Q = k(y) in Example 6.6, p.448. You should review p.231. This is essentially the same as that example, but with "k" in place of "F", and "y" in place of "x". ---------------------------------------------------------------------- You ask about the difference between the definitions Rotman gives of "content" for polynomials over the rationals, and polynomials over a UFD R (p.451). To be more precise about what I said in class: On p.451 it would have been better if he had defined content for all nonzero polynomials over the field of fractions Q of R, not just those polynomials whose coefficients are actually in R. Then the definition of c(f) would have been the element of Q (unique up to multiplication by a unit of R) such that, when we write f = c(f) f*, the polynomial f* is a primitive polynomial with coefficients in R. In the case R = Z, our old definition would be essentially the same as this, the one small difference being that instead of letting it just be unique up to multiplication by a unit of Z, i.e., 1 or -1, we made it unique in that case by choosing it to be positive. ---------------------------------------------------------------------- You ask, regarding the first display in Corollary 6.18, p.454, "What exactly are G and H?". What Rotman says is "If f(x) = G(x) H(x) in Q[x] ...". Since he has already taken f(x) to be a member of R[x], he is now saying "Suppose we have a factorization of f(x) as a product of two members of Q[x]". So G and H are any two elements of Q[x] whose product is f(x); so they are polynomials with coefficients in Q. ---------------------------------------------------------------------- > On pg. 454, Rotman gives a proof to corollary 6.18. > He notes that G star and H star are primitive > polynomials in R[x]...how does he know this? Essentially, this is Lemma 6.15(i); except that Rotman has forgotten that he only defined "content" for polynomials in R[x], not for general polynomials in Q[x], and hence, necessarily, has only proved Lemma 6.15 in that case. 
The best way to have handled things would have been to define content and primitive polynomial and prove Lemma 6.15 for polynomials in Q[x]. In class, I showed a way of fixing this particular proof without going back and making those changes: Write G(x) = b^-1 g(x) with b \in R - {0}, g(x) \in R[x], and then apply Lemma 6.15(i) to get g(x) = c(g) g*(x) with g*(x) primitive in R[x]; and do the same for H(x). > Also, he says that the content of G multiplied with the content > of H equals the content of f. ... Rotman seems to assume they > are associates in R ... You're right that "equal" should be "associates". Assuming that one proves Lemma 6.15 for polynomials in Q[x], it will say that for a polynomial f(x)\in Q[x] - {0} one has a decomposition f(x) = c(f) f*(x) with c(f)\in Q-{0} and f*(x) primitive in R[x], which is unique up to multiplication by _units_of_ R; so two possible values of c(f) will indeed be associates in R if they lie in R. To see what the corrected statement should look like, look at Lemma 3.60 on p.271: Notice that that lemma does not just concern polynomials over the field of rational numbers, but the relationship between polynomials over the field of rational numbers and polynomials over the integers. If one removes from that lemma the condition that c(f)\in Q be positive, then the resulting factorization is unique up to multiplication of the terms by +-1. Here +-1 are precisely the units of Z. (The units of Q, in contrast, comprise all nonzero rationals.) Similarly, the corrected version of Lemma 6.15 would concern the relation between polynomials over the field of fractions Q and polynomials over the original ring R, and the results would be unique up to multiplication by units of R. ---------------------------------------------------------------------- You ask about the statement on p.455 just before the exercises, that xy^2 + z is a primitive polynomial. Rotman should be more explicit, and say that viewed as a polynomial in x over k[y,z] it is primitive. As such, the coefficients are y^2 and z. In the UFD k[y,z], y and z are irreducibles that are not associates, so y^2 and z have gcd equal to 1, so xy^2 + z is primitive.
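(If you have a computer algebra system handy, it will happily confirm the gcd computation; for instance, in Python with sympy -- assuming I am remembering its gcd function correctly -- one can check both this example and a contrasting non-primitive one.)

  # gcd of the coefficients of x*y^2 + z, viewed as a polynomial in x over k[y,z].
  from sympy import symbols, gcd
  y, z = symbols('y z')
  print(gcd(y**2, z))       # 1, so x*y**2 + z is primitive
  print(gcd(y**2, y*z))     # y, so x*y**2 + y*z, by contrast, would not be primitive

----------------------------------------------------------------------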