ANSWERS TO QUESTIONS ASKED BY STUDENTS in Math 250A
Fall 2002 and Fall 2006
taught from Lang's "Algebra", 3rd ed.
(The "Companion" referred to in places below is some supplementary
material I provide for my students. Pagination in the Companion
changes slightly from year to year, but in the answers below, I try
always to include the page of Lang to which a note in the Companion
refers, which should remain constant. The answers below are arranged
according to the page in Lang referred to, even though in class we
modify that order in places, e.g., interpolating sections from the
chapter on modules between sections of the chapter on rings.)
----------------------------------------------------------------------
Regarding Lemma 0.c1 in the "Companion" (referring to p.x of Lang),
you ask whether there is a general concept of algebraic structure, such
that if A has such a structure, one can give criteria for such a
structure to be induced on A/~, and get a similar result with
"homomorphism" in place of "set map".
Yes. Sets with algebraic structure in that general sense are the topic
of the field variously called "universal algebra" or "general algebra".
It is Math 245 here at Berkeley; cf. my course notes,
http://math.berkeley.edu/~gbergman/245 .
Algebraic structures in general are not formally defined until
Chapter 8 of those notes, but lots of examples are given in Chapter 3
to motivate the later development. In particular, in section 3.2,
it is observed (as we will see in reading #3) that the equivalence
relations on a group G induced by group homomorphisms are in
one-to-one correspondence with the normal subgroups of G; while
in section 3.10 of those notes it is shown that the corresponding
result is _not_ true for monoids: The equivalence relation determined
by a homomorphism is not determined by the set of elements the
homomorphism sends to the identity. However, a precise description is
given of which equivalence relations ~ on a monoid M are induced by
homomorphisms (equivalently, which equivalence relations ~ have the
property that M/~ can be made a monoid so that the canonical map is a
homomorphism). These are called "congruences on M", and the analogous
definition and concepts for general algebras (which are straightforward
once one has seen the monoid case) are set down in Chapter 8.
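For readers who like to see such things concretely, the monoid
phenomenon is easy to check by machine. Here is a small Python
illustration; the example, the monoid ({0,...,9}, max) with the
homomorphism f(x) = min(x,1), is my own choice, not the one worked
out in the notes.

```python
# M = ({0,...,9}, max) is a monoid with identity 0;
# f(x) = min(x, 1) maps it homomorphically onto ({0,1}, max).
M = range(10)
f = lambda x: min(x, 1)

# f respects the operation and the identity,
assert all(f(max(a, b)) == max(f(a), f(b)) for a in M for b in M)
assert f(0) == 0

# and only the identity is sent to the identity,
assert [x for x in M if f(x) == 0] == [0]

# yet f identifies 1,...,9 with one another: the congruence induced by
# a monoid homomorphism is not determined by the preimage of the identity.
assert f(3) == f(7) and 3 != 7
```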
----------------------------------------------------------------------
You ask about additive and multiplicative notation (Lang, p.3).
As you observe, there are certain objects for which one or another
operation is traditionally written with the symbol "+" or ".", and
this tradition is so strong that people would not think of writing
it differently; in particular, the real numbers and various related
objects have two operations, one of which is traditionally written
"+" and the other ".", so that changing notation for these would be
very confusing.
However, if one looks at a new group to which no such tradition
applies, one could denote its operation by either symbol. Since
traditionally-defined operations include some that are written
multiplicatively but are noncommutative (e.g., multiplication of
matrices),
while everything traditionally denoted "+" is commutative, one makes
the convention of not using "+" unless the operation is commutative;
so a commutative operation can be denoted by either symbol, while a
noncommutative group or monoid operation can only be denoted "." (or,
occasionally, by an ad hoc symbol such as "*".)
More important than the fact that different groups can have operations
written with different symbols is the fact that in studying groups, one
will often be making statements concerning an arbitrary group; and
since that group need not be commutative, one will generally write the
operation ".". Since "arbitrary group" includes the case of groups
such as R under the operation of addition, such results are applicable
to these cases, so one must be able to view a statement about "." as
a generic statement which includes cases written "+" (and, inversely,
when one proves a result about abelian groups and writes the operation
"+", one must be able to view this as including abelian groups whose
multiplication is written ".".)
I can't tell whether you need still more persuasion to accept "+"
and "." as two possible symbols for a group operation, rather than only
as names for distinct operations with distinct meanings. Let me know.
(When we come to ring theory in reading #12, we will be back in the
situation where "+" and "." have distinct, non-interchangeable
meanings.)
----------------------------------------------------------------------
You ask about conditions for a monoid (p.3) to be embeddable in a group
(p.7).
Mal'cev obtained a necessary and sufficient criterion. It involves an
infinite set of conditions, one obtainable from each "formation" of two
kinds of brackets, "()" and "[]". (E.g., if I recall, ([)] is such a
"formation".) Cf. P.M.Cohn, "Universal algebra", section VII.3.
----------------------------------------------------------------------
You ask whether the result stated in italics at the top of p.5
could be proved, alternatively, by showing that every permutation
\psi is a product of transpositions of adjacent elements.
Yes. Doubtless the reason that Lang didn't do it that way is that
one thinks of this fact as a result in group theory (it appears
as Exercise 38(b) at the bottom of p.78), so he didn't want to
state it until he had defined groups, and the symmetric group in
particular.
However, another way to do things would certainly have been to prove
this statement about transpositions here, as a lemma on the way to
getting this result about commutative monoids, and then note that one
already has it available "with no extra work" when one wants it in the
study of the symmetric group later on.
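Concretely, that lemma on transpositions is just bubble sort in
disguise; here is a Python sketch of the decomposition (my own
illustration, not taken from Lang):

```python
def adjacent_transpositions(perm):
    """Decompose perm (a tuple with perm[i] the image of i) into adjacent
    transpositions: applying the returned swaps, in order, to the
    identity arrangement rebuilds perm."""
    p = list(perm)
    swaps = []
    # bubble-sort p back to the identity, recording each adjacent swap
    for _ in range(len(p)):
        for i in range(len(p) - 1):
            if p[i] > p[i + 1]:
                p[i], p[i + 1] = p[i + 1], p[i]
                swaps.append((i, i + 1))
    # each swap is its own inverse, so undoing the sort means
    # applying the recorded swaps in the reverse order
    return swaps[::-1]

# sanity check over all of S_4
from itertools import permutations
for perm in permutations(range(4)):
    p = list(range(4))
    for i, j in adjacent_transpositions(perm):
        p[i], p[j] = p[j], p[i]
    assert tuple(p) == perm
```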
----------------------------------------------------------------------
You ask about the term "family" on p.9, first Example, 4th-from-last
line.
"Family" is a general term mathematicians use to mean "elements
collected together in some way". One way they can be collected is
as a set; another way is as an I-tuple; so "family" can mean either
of these. In this case, it means "I-tuple".
----------------------------------------------------------------------
You ask whether the square is the only shape whose symmetry group
is "the group of symmetries of the square" (Lang, p.9, bottom).
In a narrow sense, certainly not. If we start with a square
centered at 0 in R^2, its symmetries will be a certain group
G of linear maps R^2 --> R^2. Now if we take any geometric
object X in the plane (e.g., a triangle), and "symmetrize" X
using G, i.e., take the union Y of the images of X under all
the elements of G, then in general, G will be the group of
symmetries of Y. (In some special cases, Y will have more
symmetries; but in "most" cases it will not.)
However, one can interpret the question in other ways. One of them
is "Is the above representation of the abstract group D_4 as the
concrete group of linear maps of the plane that represent symmetries
of the square essentially unique?" Again, the answer is no: There
are faithful representations of D_4 by linear maps on R^n which
are not built up in obvious ways from that representation.
On the other hand, one can ask questions in which "unique" is
replaced by "simplest", and then one tends to get positive answers.
For instance, the smallest number n of elements such that D_4
can be represented faithfully by permutations of n elements is 4,
and one can show that every faithful 4-element G-set is isomorphic
either to the G-set of vertices of the square, or to the G-set of
edges of the square.
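If you want to experiment with these representations, the action of
D_4 on the 4 vertices is easy to generate by machine; the following
Python sketch (my own) builds it from a rotation and a reflection:

```python
def compose(p, q):          # (p∘q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(4))

r = (1, 2, 3, 0)            # rotation by 90 degrees: vertex i -> i+1 mod 4
s = (0, 3, 2, 1)            # reflection in the diagonal through vertices 0, 2

# close {r, s} under composition to get the whole group
G = {(0, 1, 2, 3)}
frontier = {r, s}
while frontier:
    G |= frontier
    frontier = {compose(a, b) for a in G for b in G} - G

assert len(G) == 8                      # all of D_4, acting faithfully on 4 points
assert compose(r, s) != compose(s, r)   # and it is nonabelian
```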
----------------------------------------------------------------------
You ask what I mean by "left translation" and "right translation" in
the discussion on "alias and alibi" in the Companion (note referring
to Lang's p.13).
What Lang calls the "translation map" T_x determined by x at the
bottom of p.26 is more precisely called the "left translation map"
determined by x, since it takes each y and multiplies it on the
left by x. Likewise, the "right translation map" takes each y
and multiplies it on the right by x, giving yx. Lang implicitly
invokes these in referring to right cosets, at the top of p.27,
though he doesn't give them a name.
(Hmm. I should probably make the discussion of "alias and alibi"
a comment to section I.5, rather than I.2.)
> ... Does the problem of alias and alibi often arise in group theory?
Probably less often than it used to, since group theorists are now
clear that the fundamental concept of a permutation of a set should
be defined as a bijective map of the set to itself. But in elementary
expositions, it is still tempting to represent such actions as
moving an object against a background, e.g., shifting marbles among
holes; so it can still be a problem for beginners.
----------------------------------------------------------------------
You ask why, in the next-to-last line on p.13, one has
xH = f^{-1}(f(x)); specifically, why the right-hand side is
contained in the left-hand side.
If y is an element of f^{-1}(f(x)), that means f(y) = f(x). Now,
following the heuristic I noted in class, one takes this equation
saying that x and y "behave alike" (under f) and transforms it
into one that says that a certain element behaves like the identity.
One can do this by multiplying either on the right or on the left by
f(x)^{-1}. Performing those two multiplications, and using the fact
that f respects products and inverses, we get, on the one hand,
f(yx^{-1}) = e, and on the other hand, f(x^{-1}y) = e. These say
yx^{-1}\in H, respectively x^{-1}y\in H, in other words, y\in Hx,
respectively y\in xH, showing that f^{-1}(f(x)) is contained in
both Hx and xH, as the equation you ask about, and the other one
that Lang relates it to, require.
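This computation can be seen in a small concrete case. In the Python
sketch below (my choice of example, not Lang's), G = S_3, f is the
sign homomorphism, and H = A_3 is its kernel; the fibers of f
coincide with both the left and the right cosets of H:

```python
from itertools import permutations

n = 3
def compose(p, q):
    return tuple(p[q[i]] for i in range(n))

def sign(p):    # parity of a permutation of {0,...,n-1}, by counting inversions
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2

G = list(permutations(range(n)))
H = [h for h in G if sign(h) == 0]      # the kernel of sign, i.e. A_3

for x in G:
    fiber = {y for y in G if sign(y) == sign(x)}     # f^{-1}(f(x))
    assert fiber == {compose(x, h) for h in H}       # = xH
    assert fiber == {compose(h, x) for h in H}       # = Hx
```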
----------------------------------------------------------------------
You ask how we know that the homomorphisms of conditions (i) and (ii)
on p.16 are unique.
"Unique" in each case means "the only homomorphism that satisfies
the stated conditions". In each case, write out the condition that
Lang says the homomorphism is to satisfy, as an equation in group
elements. I think you will see that that equation determines
how the homomorphism must act. If not, write to me (or show me in
office hours) how far you have gotten with the calculation, and I will
show you what comes next.
----------------------------------------------------------------------
You ask about the discussion in the Companion concerning p.17 of Lang;
in particular, why the composite map G -> G/K -> (G/K)/(H/K) has
kernel H, noting that it is difficult to picture the "cosets of
cosets" that comprise the latter group.
Well, I don't consider it particularly helpful to regard elements of
factor groups as cosets anyway. The important thing about a factor
group G/H is that it is a group given with a homomorphism of G
onto it which has kernel H. The construction by cosets is simply
the tool used in showing that such a group exists. Let us write
q_1 for the canonical map G -> G/K and q_2 for the canonical map
G/K -> (G/K)/(H/K). Then the kernel of the composite q_2 q_1 is
{x\in G | q_2 q_1(x) = e}. Since q_2 has kernel H/K, q_2 q_1(x) = e
if and only if q_1(x)\in H/K; but q_1(x)\in H/K <=> x\in H.
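For a quick numerical illustration (my own choice of example), take
G = Z, K = 6Z, H = 2Z; then q_1 is reduction mod 6, reduction mod 2
realizes the map G/K -> (G/K)/(H/K), and the kernel of the composite
is exactly H = 2Z:

```python
# G = Z, K = 6Z, H = 2Z: q1 is reduction mod 6, and reduction mod 2
# realizes the canonical map G/K -> (G/K)/(H/K).
q1 = lambda x: x % 6
q2 = lambda y: y % 2
for x in range(-30, 30):
    # q2(q1(x)) = e exactly when x lies in H = 2Z
    assert (q2(q1(x)) == 0) == (x % 2 == 0)
```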
You also point out that in the next paragraph of the Companion, I refer
to G/H, while Lang does not assume H normal. You're right; I
hadn't realized that. Well, it is best to assume H normal when
motivating the argument. Then, to get the boxed equation under Lang's
more general assumption, apply the result so proved using N_H in
place of G. Since by definition, H is normal in N_H, this works.
You also ask why the kernel of the composite shown in the next-to-last
display on p.17 of Lang is H. Well, write down a statement of what
that kernel is, and compare with the definition of H a few lines
earlier! (Remember that, as I said above, it is not important to look
at G'/H' as consisting of cosets; just regard it as a group with a
homomorphism of G' onto it that has kernel H'.)
----------------------------------------------------------------------
You ask why, in the proof of Proposition I.3.1 on p.18, X is normal
in G.
Lang says a few lines earlier, "... it suffices to prove that if G
is finite, abelian ...". So he is assuming G abelian; and in an
abelian group every subgroup is normal.
----------------------------------------------------------------------
You ask about the embedding of H_i/H_(i+1) in G_i/G_(i+1) near the
bottom of p.19.
This is a case of the second boxed isomorphism on p.17. Figure out
what the "H" and "K" of that isomorphism have to be in order to make
the left-hand side come out to H_i/H_(i+1), then see what the
right-hand side comes to for those values of "H" and "K", and you'll
see that it's a subgroup of G_i/G_(i+1); so the former embeds in
the latter. Also check out the discussion I have in the Companion
of the intuitive meaning of that boxed isomorphism, and see how it
applies in this case.
----------------------------------------------------------------------
You ask what Lang means, on p.20, line 5, by "factors through
the ... group G/G^c."
He means that the given homomorphism f: G --> G' can be written as
a composite, G --> G/G^c --> G', where the map G --> G/G^c is the
canonical map from G to G/G^c. (I hope my discussion in class of the
construction G/G^c made it clear why this is true.) If we call that
canonical map q, and the second map above a, then we have f = aq,
a "factorization" of f.
(The fact that a group G/N, and in particular, G/G^c, is called
a "factor group" of G is a coincidence. The "factoring" of the map
refers to writing it as a composite, not to one of the groups involved
being a factor-group.)
----------------------------------------------------------------------
In connection with the concept of simple group (p.20) you ask whether
there exist infinite simple groups. Definitely, though it takes some
work to verify the examples. To describe the easiest one, consider the
group GL(n,K) of invertible n x n matrices over a field K (n > 1).
This itself is not simple, since the determinant function is a
homomorphism to the multiplicative group of K, and its kernel is a
proper nontrivial normal subgroup. So let us consider that kernel, a
group called SL(n,K). Even this may not be simple, because if K has
a nontrivial nth root of unity zeta, then the corresponding scalar
matrix zeta I has determinant 1, hence lies in this group, but is
clearly central, hence generates a normal subgroup. So one divides out
by the subgroup of all such elements, getting a factor-group called
PSL(n,K). This is a simple group in almost all cases (the only
exceptions are when n = 2 and K = Z/2Z or Z/3Z), but it takes some
tedious linear algebra to prove this. (So it isn't really appropriate
to give this example before we've covered linear algebra.)
There are also examples obtained by presentations by generators and
relations; but these can't be discussed until we have come to that
technique.
Oh, I guess there is an example that is easy to give at this point:
I assume you have seen the fact that the alternating groups A_n
are simple for n >= 5, which we will prove soon. Well, one can
define the "infinite alternating group" as the set of permutations
sigma of the set N of natural numbers such that (i) sigma(i) = i
for almost all i, and (ii) if we let k be an integer such that all
the i with sigma(i) not-= i are <= k, then the restriction
of sigma to {1,...,k} is an even permutation. Then one can verify
that these elements form a group, and deduce from the fact that the
groups A_n are simple for large n that this group is also simple.
But this is a little less satisfying than the previous examples,
because it arises from finite simple groups, while those examples are
"essentially" infinite.
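Condition (ii) is easy to test by machine, since the parity of the
restriction can be read off the cycle decomposition. Here is a Python
sketch; the encoding of a finitely-supported permutation by its
dictionary of moved points is my own device:

```python
def in_alt_infinity(moved):
    """moved: a dict {i: sigma(i)} listing exactly the i with sigma(i) != i.
    Tests condition (ii): the restriction of sigma to a block containing
    all moved points is even (parity read off the cycle decomposition)."""
    assert set(moved.values()) == set(moved)   # finite support, and a bijection
    seen, parity = set(), 0
    for start in moved:
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:                   # trace the cycle through start
            seen.add(i)
            i = moved[i]
            length += 1
        parity ^= (length - 1) & 1             # an L-cycle is L-1 transpositions
    return parity == 0

assert in_alt_infinity({1: 2, 2: 3, 3: 1})     # a 3-cycle is even
assert not in_alt_infinity({1: 2, 2: 1})       # a transposition is odd
```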
----------------------------------------------------------------------
You asked in your question for Monday about the meanings of "exponent"
referring to a group: "an exponent" as defined by Lang on p.23 vs.
"the exponent" meaning the least such value.
Both are used; they are important in different contexts. If one
wants to make a statement about all groups that satisfy x^6 = e,
one wants to be able to call these "the groups of exponent 6" and
not exclude those that in fact have exponent 2 or 3 (or the trivial
group with exponent 1) -- the consequences of the equation x^6 = e
remain true in those cases. But if one is studying a particular group,
it is natural to look at the least exponent, and call it "the exponent
of G". So the wordings "groups of exponent n" vs. "the exponent
of G" carry this distinction. When there is real ambiguity, one
should say which one means.
----------------------------------------------------------------------
You ask about the proof of Prop. I.4.3(iii) (p.24). The key idea
lies in the paragraph preceding the proposition. In the context
of (iii), each of a, b determines a homomorphism Z --> G; as in
that paragraph, this induces an isomorphism between "G_1" and "G_2", in
this case, G and G. If you look at the details of how this is
constructed, you will see that it takes a to b.
----------------------------------------------------------------------
You asked about the isomorphism Z/mZ =~ G/f(mZ) in the second line of
the proof of Prop. I.4.3, p.25. To see this, identify G with Z/nZ
and f with the canonical map from Z to this factor group; then
apply the first boxed isomorphism of p.17 with Z for "G", nZ for
"K", and mZ for "H".
----------------------------------------------------------------------
Regarding Lang's Proposition I.4.3, proved on p.25, you ask
> In the proof of (v), how does surjectivity follow from the Chinese
> remainder theorem (which is a statement about ideals of a ring)?
In the set of integers, the ideals are the same as the additive
subgroups. (Anyway, the proof in the Companion gets around this
use of a not-yet-proved result.)
----------------------------------------------------------------------
Regarding the note in the Companion about the top paragraph on
p.26 of Lang, you ask:
> Does it ever come in handy to know that every group is isomorphic
> to a group of permutations? I have never been able to use that
> to solve a problem.
Well, if we didn't know this, it would certainly be of interest
to study the properties of the class of groups that could be so
represented, and figure out how they differ from other groups.
Since the underlying significance of the group concept is that it
describes the natural structure on the set of automorphisms
of a mathematical object, we would want to know "What properties of
automorphisms have we missed in the definition of `group'?"
It's also extremely useful heuristically -- if we want to find a
group with a given property, we know that if we can find one, we can
find such a group described as some group of permutations of a set;
and that is often the best way to construct examples.
----------------------------------------------------------------------
You ask when the action of a group by conjugation on its subgroups is
faithful, and whether it can be transitive (concepts introduced on
p.28).
Your e-mail parenthetically asks whether there is some connection
between the kernel of that action and the center of the group, and you
should have been able to see that there is: The center is contained
in that kernel; so for the action to be faithful, the center must be
trivial. I am not sure whether the converse is true. It seems
very difficult for it to fail, since this would mean that every
inner automorphism induced by a nonidentity element would have to
move some elements, yet some such inner automorphism would have to
carry all subgroups into themselves. There is a group in which
certain inner automorphisms that move elements do carry all subgroups
into themselves, namely the 8-element "quaternion group"; but these
automorphisms modify all elements by members of the center of the
group, and I don't see how to find an example that doesn't have a
center. In summary: Having trivial center is a necessary condition
for the action to be faithful, and might also be sufficient but I
don't know.
On whether it can be transitive -- it has to send every subgroup to
a subgroup of the same order, so if one looks at the set of all
subgroups, including G and {e}, it certainly can't be transitive
unless G = {e}.
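The quaternion group case mentioned above can be checked by brute
force. The Python sketch below (my own, using the standard
representation of Q_8 by 2 x 2 complex matrices) verifies that every
element carries every subgroup into itself, so the kernel of the
conjugation action on subgroups is all of Q_8, strictly larger than
the center:

```python
from itertools import combinations

def mul(a, b):   # product of 2 x 2 complex matrices, stored as nested tuples
    return tuple(tuple(sum(a[r][k] * b[k][c] for k in range(2))
                       for c in range(2)) for r in range(2))

E = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))
J = ((0, 1), (-1, 0))

# generate Q_8 = <I, J> by closing under multiplication
Q8 = {E}
frontier = {I, J}
while frontier:
    Q8 |= frontier
    frontier = {mul(a, b) for a in Q8 for b in Q8} - Q8
assert len(Q8) == 8

# find all subgroups by brute force (orders limited by Lagrange)
subgroups = []
for r in (1, 2, 4, 8):
    for cand in combinations(Q8, r):
        S = set(cand)
        if E in S and all(mul(a, b) in S for a in S for b in S):
            subgroups.append(frozenset(S))
assert len(subgroups) == 6

inv = {g: next(h for h in Q8 if mul(g, h) == E) for g in Q8}
center = {g for g in Q8 if all(mul(g, h) == mul(h, g) for h in Q8)}
assert len(center) == 2          # the center is {E, -E}

# every element, central or not, carries every subgroup into itself;
# so the kernel of the conjugation action on subgroups is all of Q_8
for g in Q8:
    for S in subgroups:
        assert {mul(mul(g, s), inv[g]) for s in S} == S
```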
----------------------------------------------------------------------
You ask about the orbit decomposition formula (p.29).
The basic idea is that if a set is the union of a family of disjoint
subsets, then its cardinality is the sum of their cardinalities.
To describe the cardinality of an orbit in S, one has to pick
a point s of that orbit, and use Prop.I.5.1. On p.29, in the
display preceding the orbit decomposition formula and the surrounding
sentences, Lang sets up the notation for choosing a point from
each orbit; the formula is expressed using that notation. I hope
that with this in mind you can follow what he does. If not, write
again.
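For a concrete instance (my own choice of example), one can let S_4
act on itself by conjugation, so that the orbits are the conjugacy
classes, and verify both Prop. I.5.1 and the orbit decomposition
formula numerically:

```python
from itertools import permutations

n = 4
G = list(permutations(range(n)))

def compose(p, q):                     # (p∘q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(n))

def inverse(g):
    out = [0] * n
    for i, v in enumerate(g):
        out[v] = i
    return tuple(out)

def conj(g, s):
    return compose(compose(g, s), inverse(g))

seen, orbit_sizes = set(), []
for s in G:
    if s in seen:
        continue
    orbit = {conj(g, s) for g in G}    # the conjugacy class of s
    seen |= orbit
    # Prop. I.5.1: the orbit of s has cardinality (G : G_s)
    G_s = [g for g in G if conj(g, s) == s]
    assert len(orbit) == len(G) // len(G_s)
    orbit_sizes.append(len(orbit))

# the orbit decomposition formula: #(S) = sum over a set of representatives
assert sum(orbit_sizes) == len(G)
assert sorted(orbit_sizes) == [1, 3, 6, 6, 8]   # the class equation of S_4
```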
----------------------------------------------------------------------
You ask whether the definition of pi(sigma) at the bottom of p.30
should be f(x_sigma^-1(1), ..., x_sigma^-1(n)) rather than the
formula Lang gives.
That is a question that bothered me for a long time; but as implied
in my comment on that formula in the Companion, it is correct. The
difference between this and the case of d(sigma^-1( ), sigma^-1( ))
(earlier on the same page of the Companion) is that in that case, the
function sigma was being applied to each argument of d, while here
sigma is being used to change the order of the arguments.
Of course, if one thinks of f as having for "argument" an n-tuple
of integers, then sigma is acting on the argument of f. But it
is acting in a way that reverses order of composition; so two reversals
combine to give a correct action.
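The combination of the two reversals can be checked mechanically. In
the Python sketch below (my formulation, not Lang's notation), a
permutation sigma acts on an n-ary function by permuting its argument
slots, (sigma.f)(x_1,...,x_n) = f(x_{sigma(1)},...,x_{sigma(n)}), and
the composite does come out to a genuine left action:

```python
from itertools import permutations

n = 3
def act(sigma, f):
    # (sigma . f)(x_1,...,x_n) = f(x_{sigma(1)},...,x_{sigma(n)})
    return lambda xs: f(tuple(xs[sigma[i]] for i in range(n)))

def compose(s, t):               # (s∘t)(i) = s(t(i))
    return tuple(s[t[i]] for i in range(n))

f = lambda xs: xs                # identity function; keeps all arguments visible
xs = (10, 20, 30)

for s in permutations(range(n)):
    for t in permutations(range(n)):
        # the two order-reversals combine: s.(t.f) = (s∘t).f, a left action
        assert act(s, act(t, f))(xs) == act(compose(s, t), f)(xs)
```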
----------------------------------------------------------------------
Your explanation for the argument at the top of p.31 seems to be
right. In summary: Lang has proved that if a quotient of subgroups
of the symmetric group is abelian, then the property of containing all
3-cycles carries over from the big subgroup (if it holds there) to the
smaller one. Then he considers a tower in which all steps are abelian,
and his top group, S_n, contains all the 3-cycles; so by the above
statement, that property of containing all the 3-cycles carries over,
inductively, to each member of the chain. So if one also assumes the
chain ends in {e}, one has a contradiction.
----------------------------------------------------------------------
You write, in relation to my discussion of semidirect products
(Companion, comment on Lang's p.33):
> Suppose N is a normal subgroup of G. I wonder if in general, we
> can write G as some product of N and G/N (direct or semidirect or
> something more general).
Well, as the example G = Z, N = nZ that I gave in class shows,
G may not contain a copy of G/N, so if "some product" implies some
group that contains G/N, the answer is no. But if one doesn't require
that, then yes. I'll sketch the construction as a generalization of
the semidirect product. Suppose we are given groups H and N, and
an action \psi: H -> Aut(N). Then we define a group with underlying
set H x N, and with multiplication of the form
(h_1, x_1) (h_2, x_2) = (h_1 h_2, c(h_1,h_2) x_1^{\psi(h_2)} x_2)
This is just like the definition of the semidirect product,
except for the term c(h_1,h_2), which is given by a function
c: HxH --> N which is required to satisfy ... exactly those identities
needed to make the above operation associative! Those identities are
called "the cocycle conditions", and c is called a "cocycle" for
H, N, and \psi. I think this construction is used mainly in the
case where N is abelian; otherwise the cocycle conditions are
not nice enough to make practical use of. (But I'm not a group
theorist, so I only have impressions of what they do.)
As a very simple example, consider again the case where G is the
additive group of Z, and N is the subgroup of multiples of a
positive integer n. Thus, we want to construct Z from H = Z/nZ
and N = nZ. Here \psi is trivial; to define c, let us write
each element of H as [i], the congruence class of i, where
i\in\{0,...,n-1\}. Then we define c([i],[j]) to be 0 if i+j < n,
and to be n\in nZ if i+j \geq n. Then you will find that pairs
([i],nk) compose just the way the integers nk+i add.
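That verification can also be done by machine. In the Python sketch
below (my own coding of the construction), pairs ([i], nk) are
multiplied by the rule above with trivial \psi, and checked against
addition in Z under the correspondence m <-> ([m mod n], m - (m mod n)):

```python
n = 5
def c(i, j):                 # the cocycle: 0 if there is no "carry", n if there is
    return 0 if i + j < n else n

def mult(p, q):              # multiply pairs ([i], nk); the action psi is trivial
    (i, a), (j, b) = p, q
    return ((i + j) % n, c(i, j) + a + b)

def enc(m):                  # the correspondence m <-> ([m mod n], m - (m mod n))
    return (m % n, m - m % n)

for x in range(-40, 40):
    for y in range(-40, 40):
        assert mult(enc(x), enc(y)) == enc(x + y)
```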
----------------------------------------------------------------------
You ask about the isomorphism between right and left semidirect products
(Companion, p.17, bottom, re Lang p.33).
Well, remember that these are based on thinking of a group with a
normal subgroup N and a subgroup H having trivial intersection
with N, such that NH = G, and the two ways of uniquely writing
any element of G, as nh or hn. A given element of G will have
an expression in each of these forms; if you write down a formula
relating one expression to another, and then use this as a formula
for mapping a pair (n,h) to a pair (h,n), it will give the
desired homomorphism.
Remember also that the two semidirect products will involve
homomorphisms "H --> Aut(N)" which have to be interpreted slightly
differently, depending on whether one regards these as written
on the right or on the left. To get the correspondence between the
two constructions, one has to make precise the relation between these
two versions of "Aut(N)".
----------------------------------------------------------------------
Sorry I didn't work Lemma I.6.1, p.33 into lecture; as usual, there is
not enough time for everything I would like to say. Anyway, the proof
of that Lemma consists of two parts: In the first sentence Lang states
an auxiliary result which he will establish; the proof of that lasts
until the display; when he has proved it, he uses it to prove the Lemma.
That second part is clarified by my note in the Companion.
It may seem at first that the auxiliary result is unrelated to the
Lemma; but notice that each of them relates orders of elements to
the order of the group. Of course, we know that the order of every
element divides the order of the group; these results are partial
converses -- they say, roughly, that the order of the group can't
have prime factors that don't come from orders of elements. So it's
not surprising that one of these statements yields the other.
----------------------------------------------------------------------
You ask about conditions under which, given a factorization of the
order of a finite group G into relatively prime factors r and s,
one can say that G must have a subgroup of order r -- as a
possible generalization of the existence of p-Sylow subgroups (p.34).
To say that r is a factor of the order of G such that the
complementary factor s is relatively prime to r is equivalent
to saying that for some set of primes \pi, r is the product of
the largest powers of the members of \pi occurring in the order
of G. In this situation, a subgroup of G having order r is
called a "Hall \pi-subgroup of G". I know that people have studied
the question of for which sets of primes \pi a group will have
a Hall \pi-subgroup, but I don't know what general results
have been found. For a negative example, if G = S_7 and \pi =
{5,7}, then a Hall \pi-subgroup of G would have order 35. If
one existed, this would mean that a group of order 35 had a faithful
action on a set of 7 elements. But as Lang notes in the next-to-last
example on p.36, every group of order 35 is abelian. It is not hard
to show that a faithful action of an abelian group of order 35 must
have at least one orbit of order divisible by 5 and at least one of order
divisible by 7. Whether these are the same orbit (in which case, its
order must be divisible by 35) or different (in which case, their
orders must add up to at least 12), this is clearly impossible in a
set of 7 elements. (The same argument shows that S_5 has no Hall
{3,5}-subgroup; I used the above case just because Lang had explicitly
noted the statement about groups of order 35.)
Anyway, knowing the name, you should be able to look for further
references if you want.
----------------------------------------------------------------------
You ask how the statement about (H:H_s_i) being divisible by p
completes the proof of Lemma I.6.3(a) on p.34.
That statement about (H:H_s_i) concerns the summands in the
displayed equation that correspond to non-fixed points. Hence, modulo
p, we can drop all those summands, and conclude that #(S) is
congruent modulo p to the sum of the terms corresponding to
the fixed points. Each of those terms is (H:H) = 1 (since the
isotropy subgroup of a fixed point is the whole group), so #(S) is
congruent to the number of those summands.
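A standard concrete instance of this counting (my example, essentially
the orbit-counting proof of Fermat's little theorem) takes H = Z/5Z
acting by cyclic shift on binary 5-tuples: the only fixed points are
the two constant tuples, and indeed 32 is congruent to 2 mod 5:

```python
from itertools import product

p = 5
S = list(product((0, 1), repeat=p))     # H = Z/pZ acts by cyclic shift
def shift(s, k):
    return s[k:] + s[:k]

fixed = [s for s in S if all(shift(s, k) == s for k in range(p))]
# every non-fixed point lies in an orbit of size p (the only proper
# subgroup of Z/pZ is trivial), so modulo p only the fixed points count:
assert fixed == [(0,) * p, (1,) * p]
assert len(S) % p == len(fixed) % p     # 32 and 2 are congruent mod 5
```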
----------------------------------------------------------------------
You ask about sentence beginning "Indeed" on p.35, just before the
first display.
Here Lang is using the observation numbered (iv) on p.17, about two
groups such that one is contained in the normalizer of the other;
in particular, the statement beginning "equally obviously".
("Obviously" meaning "it comes right out when you write down what
is needed".)
----------------------------------------------------------------------
You ask how the equality H = Q at the end of the proof of
Theorem I.6.4 (p.35) gives statement (ii) of the theorem.
Because the proof of statement (i) actually gives Q as one of
the conjugates of P. (See the first words of that final paragraph,
"Next, let S be the set of all conjugates of P ...".)
----------------------------------------------------------------------
You ask why the G_i are normal in the tower of Corollary I.6.6, p.35.
The Corollary is proved by induction, so the question comes down
to seeing why normality is preserved by the inductive construction.
(It is trivial in the base case, where n=0.)
In the proof, the inductive step takes a tower for G/H and
gets from it a tower for G. The inverse image of any normal
subgroup under a homomorphism (in this case, the homomorphism
q: G -> G/H) is normal, so that step preserves normality of these
subgroups. The one subgroup that appears at this step that is
not an inverse image under that homomorphism is the final step
{e} (as a subgroup of H = q^{-1}({e})); and of course, {e}
is always normal in G.
It is interesting to compare this result with the observation "every
finite group has a normal tower with simple factors", and see what
difference accounts for the fact that that does _not_ give a tower
in which each step is a normal subgroup.
----------------------------------------------------------------------
You ask about a name for a normal tower in which every step is
in fact normal in G, as in Corollary I.6.6, p.35.
I don't know of such a name. You mention "subnormal tower";
but if anything, I would expect that term to be used to refer
to what Lang calls a "normal tower" by someone who wants to
restrict the phrase "normal tower" to the case where all the
groups are normal in G, since a "subnormal subgroup" means
a subgroup which can occur in what Lang calls a normal tower.
Incidentally, what Lang calls a normal tower is more often
called a normal series. Google shows "subnormal" used much less
commonly before "tower" or "series" than "normal"; so it doesn't
seem to be a common usage.
----------------------------------------------------------------------
You ask about the induction in the proof of Corollary I.6.6, p.35. When
Lang says "by induction", a more detailed statement would be "We may
assume inductively that the result is true for all groups of smaller
order; in particular, for G/H". Then he takes the inverse image, in
G, of the tower that the inductive assumption gives for G/H.
----------------------------------------------------------------------
In the proof of Lemma I.6.7, p.36, you ask about the phrase "the
representation of G on this orbit".
Think of "representation" as meaning "action".
The idea is that one thinks of each member of the group as being
"represented" by a certain permutation of the given set. The usage is
very common when one speaks about groups acting by linear automorphisms
of a vector space -- the study of this concept, in various forms, is
called the theory of group representations. But it is also sometimes
applied to actions by arbitrary permutations on sets.
----------------------------------------------------------------------
You ask about the case K = H in the proof of Lemma I.6.7, p.36.
K = H is what he is proving (by getting a contradiction in the
contrary case). Do you see how it gives the conclusion of the lemma?
----------------------------------------------------------------------
You ask about Lang's use in the first Example on p.36 of the fact
that the automorphism group of a cyclic group of order 7 is a
cyclic group of order 6.
It's true that he should have given some justification; but what is
really needed, namely that the automorphism group has order 6, follows
easily from what he proved earlier. See the Companion, p.19, 8th
through 5th lines from bottom, sentence beginning "Now by ...". (Once
one has the facts noted there about that group, an easy computation
for q = 7 shows that it is cyclic, as he claims, though this is not
needed.)
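For what it's worth, the claim can also be checked by direct
computation: an automorphism of a cyclic group of order 7 is
determined by where it sends a fixed generator, so Aut(Z/7Z) may be
identified with the multiplicative group (Z/7Z)^*. A quick sketch in
Python (my own illustration, not from Lang or the Companion):

```python
# Identify Aut(Z/7Z) with (Z/7Z)^*: the automorphisms are x |-> a*x
# for a = 1,...,6.  We compute the multiplicative order of each a
# and check that some a has order 6, i.e. that the group is cyclic.
def mult_order(a, n):
    """Multiplicative order of a modulo n (gcd(a, n) = 1 assumed)."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

orders = {a: mult_order(a, 7) for a in range(1, 7)}
assert len(orders) == 6                  # |Aut(Z/7Z)| = 6
assert 6 in orders.values()              # an element of order 6 => cyclic
print(sorted(orders.values()))
```

Here 3 and 5 turn out to generate (Z/7Z)^*, confirming cyclicity.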
----------------------------------------------------------------------
You ask what I mean, in the 5th from last line of the paragraph after
Lemma I.6.c3 in the Companion (in the section on structures of groups
of order pq, to go with p.36 of Lang), by mapping hn to "the
automorphism it induces".
I mean the automorphism given by conjugation by hn, restricted to N.
----------------------------------------------------------------------
You say that Lang's proof of his Lemma I.6.7 (p.36) doesn't seem to
give the statement "N < K" asserted in my Lemma I.6.c2(a).
You're right! My mistake was that I took for granted that the proof
that Lang would give of his lemma was the "natural" one; but he gives
a roundabout one, and that proof doesn't yield my assertion.
The "natural" proof of Lang's lemma is to look at the action of G on
G/K, rather than on the set of conjugates of K. Then (whether or not
K is normal) we see that the isotropy subgroup of K is K, so the
kernel of the action is a normal subgroup contained in K.
(To see how Lang's proof of Lemma I.6.7 can be completed from this
start, note that _if_ (G:K) = p = smallest prime dividing (G:1), then
(G:N), a divisor of p!, must equal p = (G:K), so K = N, so K
is normal.)
Thanks for pointing this out!
----------------------------------------------------------------------
Regarding direct sums of abelian groups (pp.36-37) you ask whether it
isn't possible to define them for non-abelian groups as well.
Well, there are two ways of thinking about a direct sum: As the
group having a certain universal property, and as the subgroup of
the direct product consisting of elements all but finitely many of
whose coordinates are the identity. Groups of each of these sorts can
be constructed in the non-abelian context, but, in contrast to the
situation for abelian groups, they are two very different groups. The
one with the universal property like that of the direct sum of abelian
groups is called the coproduct of the groups; we will see that
construction in this coming Monday's reading. The subgroup of the
direct product consisting of elements all but finitely many of whose
coordinates are the identity is called the "restricted direct product".
----------------------------------------------------------------------
You ask about the symbol "f|B" in the proof of Theorem I.7.1, p.41.
It means the restriction of f to B; i.e., the function which
has domain B instead of A, but on elements of that domain
acts exactly as f does. (Lang notes this notation on p.ix.)
----------------------------------------------------------------------
You ask (in connection with the "Remarks on split surjections, split
injections, and split exact sequences" following the discussion in
the Companion about the proof of Lemma I.7.2, p.41 of Lang), what I
mean by the short exact sequences "corresponding to" a given surjective
or injective map of abelian groups.
If f: A --> B is a surjective homomorphism of abelian groups, then
0 --> Ker(f) --> A --> B --> 0 is the corresponding short exact
sequence.
Likewise, if g: C --> A is injective, the corresponding short
exact sequence is 0 --> C --> A --> A/g(C) --> 0.
----------------------------------------------------------------------
You ask about the argument at the top of p.42 that Lang uses to show
that any two bases of a free abelian group have the same cardinality;
in particular, you ask why B/pB is a direct sum of m copies of a
cyclic group of order p.
The general observation one needs is that if an abelian group A is
written as a direct sum B (+) C, and if n is an integer, then A/nA
(where nA denotes {nx | x\in A}), can be identified with
B/nB (+) C/nC. Indeed, we get nA = nB + nC, and the isomorphism
(B (+) C) / (nB (+) nC) =~ B/nB (+) C/nC is not hard to verify.
The same is true for direct sums of more than two summands.
Now if A is a free abelian group of rank m, it is a direct sum
of m copies of Z, so A/pA is a direct sum of m copies of Z/pZ,
which is Lang's statement.
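For a concrete check (a toy illustration of mine, not from Lang):
reducing each coordinate of Z^m modulo p realizes A/pA as (Z/pZ)^m,
a direct sum of m copies of a cyclic group of order p, hence of
cardinality p^m:

```python
# For A = Z^m (free abelian of rank m), the quotient A/pA is obtained
# by reducing coordinates mod p, giving (Z/pZ)^m with p^m elements.
# We verify the count by reducing a box of lattice points mod p.
from itertools import product

def residues(m, p, box=range(-6, 7)):
    """All residues mod p of vectors of Z^m sampled from a box."""
    return {tuple(x % p for x in v) for v in product(box, repeat=m)}

for m, p in [(1, 2), (2, 3), (3, 2)]:
    assert len(residues(m, p)) == p ** m
print("ok")
```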
----------------------------------------------------------------------
You ask why, on p.43 line 5, we have urx \in A_s and vsx \in A_r; i.e.,
why surx = rvsx = 0.
Because by assumption, x\in A_m, and m = rs; and both of the
coefficients sur and rvs are divisible by rs.
----------------------------------------------------------------------
Regarding Lang's reference to the "residue class" \bar{x} of x
in the middle of p.44, you ask whether this just means the coset
it belongs to.
Right. The term "coset" is the more common one in group theory,
while "residue class" is more common in ring theory. (Different
traditions.) But Lang happens to use the latter term here in a
group-theoretic context.
----------------------------------------------------------------------
You ask about the choice of the symbols bold Z versus ordinary
(actually, italic) Z, as on p.46, bottom, in writing "Z_p" etc.
Well, boldface Z (or in recent decades, blackboard-bold Z) has
become standard for the integers, whether as a set, a group, or a ring.
(Historically, it comes from the initial letter of German "Zahl",
meaning "number".) So one can regard "Z_p" as short for "Z/pZ", and
so use boldface Z. But there are two difficulties. First,
number-theorists like to write Z_p for the ring of p-adic integers
(cf. Lang, last 3 lines of p.50 and first two lines of p.51, not in
this course's readings), which has a very different structure from
Z/pZ. (It is torsion-free, and uncountable.) Second, the use of "Z"
for cyclic groups actually comes from the German "zyklisch" meaning
"cyclic"; and it is convenient to use it even in the case where the
group is written additively rather than multiplicatively, and/or with
a generator denoted by some symbol other than "1". So these
considerations lead one to use non-boldface Z, and I find myself
pulled both ways. Even if one is inclined to use non-bold Z for
finite cyclic groups, it is very natural to use Z for the infinite
cyclic group when one identifies it with the additive group of the
integers.
----------------------------------------------------------------------
You ask whether the "\psi" on p.49, line 3 means \psi_{x'}.
Not quite: To make sense of what follows, one must understand
\psi to mean the map that takes each coset [x']\in A'/B' to
\psi_{x'}. This is well-defined by the preceding two lines;
and since for each x', he has made \psi_{x'} a homomorphism
A/B --> C, the map \psi so constructed will be
a homomorphism A'/B' --> Hom(A/B, C).
Thanks for pointing out this unexplained notation; I'll put a
note in the next version of the Companion.
----------------------------------------------------------------------
You ask about the notation "0 --> A'/B' --> Hom(A/B, C)" in the
second display on p.49.
Lang means "a one-to-one homomorphism A'/B' --> Hom(A/B, C)". He is
using the notation of exact sequences, introduced on p.15. If one
writes a sequence "0 --> X --> Y" of abelian groups, then since a
homomorphism from 0 is uniquely determined, it is probably not being
written there to focus attention on that map; rather, if one states
that the sequence is exact, this means that the kernel of the map
X --> Y is the image of the map 0 --> X; in other words, that the
kernel of the map X --> Y is zero, in other words, that the map
X --> Y is one-to-one. Since Lang has not been talking about exact
sequences, it is sloppy of him to use this notation here to express
one-oneness; but it is an easy habit to get into when one works in
areas in which exact sequences are commonly used.
Likewise, writing X --> Y --> 0 indicates a surjective map X --> Y.
----------------------------------------------------------------------
You ask whether in the definition of a category (p.53) Mor(A,B) can
be empty.
Yes. It is never empty when the category is that of groups or monoids,
because for any two groups or monoids A, B there is always the trivial
morphism taking all elements of A to the identity in B. In the
category of sets, Mor(A,B) is empty if and only if B is the empty
set and A isn't. Soon, when we study rings, which we will require
to have identity element 1 and homomorphisms carrying 1 to 1,
we shall see that Mor(A,B) is empty in many more cases, e.g.,
when A is the ring Z_n and B = Z.
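To see concretely why Mor(Z_n, Z) is empty, note that a unital ring
homomorphism would have to send 1 to 1, and hence send n.1 = 0 in
Z_n to n =/= 0 in Z. A brute-force check of mine for n = 2 (not from
Lang):

```python
# No map f: Z/2Z -> Z preserves 0, 1, addition and multiplication:
# any candidate needs f(0) = 0 and f(1) = 1, but then
# f(1 + 1) = f(0) = 0 while f(1) + f(1) = 2 in Z.
def is_unital_hom(f, n):
    return (f[0] == 0 and f[1] == 1
            and all(f[(a + b) % n] == f[a] + f[b]
                    for a in range(n) for b in range(n))
            and all(f[(a * b) % n] == f[a] * f[b]
                    for a in range(n) for b in range(n)))

n = 2
candidates = [{0: v0, 1: v1} for v0 in range(-5, 6) for v1 in range(-5, 6)]
assert not any(is_unital_hom(f, n) for f in candidates)
print("Mor(Z_2, Z) is empty")
```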
----------------------------------------------------------------------
Concerning Lang's statement on the 3rd and 4th lines from the
bottom of p.53, that most of our morphisms are actually mappings
or closely related to mappings, you ask what the distinction is
between a morphism and a mapping, and for examples where morphisms
are not mappings, in particular where they are "closely related to
mappings".
By a mapping, Lang means a function. The morphisms in a category, on
the other hand, are simply whatever elements form the sets Mor(A,B).
For an example where morphisms have nothing to do with set-maps,
let \Gamma be any graph (I hope you've seen the concept; basically,
a diagram consisting of some dots called "vertices", and "edges"
connecting some of the vertices.) Define the category C_\Gamma to
have for objects the vertices of \Gamma, and for two such vertices
x and y, let a morphism x -> y mean a "path" from x to y in
\Gamma, i.e., a sequence of consecutive edges starting at x and
ending at y. For each x, we consider an "empty sequence of edges"
to form a path from x to x, which we call the identity morphism of
x. One defines the composite of a path f from x to y with a
path g from y to z to be the path from x to z gotten by
laying f and g end-to-end. One finds that C_\Gamma satisfies the
axioms of a category; but the morphisms are not in any sense mappings
from one set to another.
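The construction can be sketched in a few lines of code (my own
illustration; the particular graph and names are made up):

```python
# The path category C_Gamma of a directed graph: objects are vertices,
# a morphism x -> y is a tuple of consecutive edges from x to y,
# composition is concatenation, and the identity at x is the empty path.
edges = {("a", "b"), ("b", "c")}          # a small example graph

def is_path(src, dst, p):
    """True if p is a sequence of consecutive edges from src to dst."""
    at = src
    for e in p:
        if e not in edges or e[0] != at:
            return False
        at = e[1]
    return at == dst

def compose(p, q):                        # p: x -> y followed by q: y -> z
    return p + q

identity = ()                             # the empty path, at any vertex
f = (("a", "b"),)                         # a morphism a -> b
g = (("b", "c"),)                         # a morphism b -> c
assert is_path("a", "c", compose(f, g))   # composites are again paths
assert compose(identity, f) == f == compose(f, identity)
print("ok")
```

Note that the morphisms here are tuples of edges, not functions
between sets.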
For an example where they are "close to mappings", see the paragraph
about the category "Rel" beginning near the bottom of p.33 of the
Companion. If one looks at relations as "multi-valued functions"
(where for each x\in X, the elements y\in Y such that (x,y)\in R
are considered "the values of R at x"), then these relations are
conceptually "close to mappings". For more examples, including some
where the morphisms are "closer" to maps than in that one, see my
Math 245 notes, section 6.2.
----------------------------------------------------------------------
Regarding the analogy between the concept of an abstract category
and that of an abstract group, noted in the material in the Companion
referring to Lang's p.53, you ask whether there is an analog of
Cayley's Theorem for categories.
The answer is "Yes, but ... ". If one assumes a set-theory such as
I sketch on p.34 of the Companion, then any category C which
is "small" with respect to a given universe will have a concretization
-- a faithful functor to the category of sets -- in that universe.
But note that a category such as the category of groups in a given
universe is not itself small with respect to that universe; and for a
general category which, like that one, merely has objects and
morphism-sets all lying in a given universe, one can't necessarily find
a concretization by sets in that universe -- though one can by sets in
any larger universe.
Sorry if this sounds confusing. The right way to come to it is by first
figuring out how the proof should work, then noting how the sets that
one constructs are related to the set-theoretic properties of the
original category. In my Math 245 notes, I sketch the idea of the proof
starting at the bottom of p.151 (ignore the blank p.152 if you're
looking at it online, and continue on p.153), before I have introduced
"universes". Then section 6.4 introduces universes, section 6.5
defines "functor", and Theorem 6.5.6 gives Cayley's Theorem, as the
statement that every small category (category which is a member of
one's chosen universe) admits a concretization (a faithful functor
into the category of sets in that universe).
----------------------------------------------------------------------
You ask what the difficulty is with categories having proper classes
of objects, alluded to at the bottom of p.31 of the Companion (re
Lang, p.54).
Well, to start with, although Lang introduces the concept of monoid
(from which he gets that of group) as a set "with" a law of composition,
and likewise now says that a category "consists of a collection of
objects ... and for two objects ... a set Mor(A,B) ...", the way
to make such things precise is to define a monoid as an ordered pair
consisting of the set and the operation (or better, an ordered 3-tuple
whose third member is the identity element, but I won't go into the
reason here), subject to the appropriate conditions, and similarly to
define a category A as (at least) an ordered 3-tuple, with first
member Ob(A), second member the family (Mor(X,Y))_{X,Y\in Ob(A)}, and third member
specifying the composition operation. But if a tuple is defined as
a certain sort of function, one can't define a tuple whose entries are
proper classes.
Given Lang's definition of category, you say the only reason you can
see for wanting categories to be sets "is if you want to perform weird
things like form categories of categories". Well, one does want to
look at such things; but without trying to convince you of this, I can
surely point out that one might want to use a set which contains two or
three or countably many categories.
If you look at exercises I.12:3-I.12:4, you will see the concept of
a "variety of groups". It is not hard to show that the set of all
varieties of groups "forms a lattice" -- except that it isn't a set.
Varieties of groups can be studied even without category theory, so
the need to get around this problem is not just a consequence of the
category-theoretic viewpoint.
Anyway, the solution is very elegant; I recommend section 6.4 of the
Math 245 notes. An amusing feature of the situation is the reason I
indicate there for proposing the axiom that every set is contained in
a universe, rather than the weaker axiom that there is at least one
universe: The latter approach tacitly creates one realm (the inside
of the universe) within which "ordinary mathematics" is done, and a
grander realm in which categorists work; the former creates a situation
in which any sort of mathematics can be considered to be done in any
universe, and any consideration that looks at it globally can be done
in the next larger universe. So the justification of the stronger axiom
is to avoid setting up a mathematics in which categorists would have
an "elite" role!
----------------------------------------------------------------------
You ask what the morphisms are in the category used in the definition
of the Grothendieck group (p.58).
If f is a homomorphism M --> A, and g a homomorphism M --> B,
then a morphism from f to g in that category is a homomorphism
h: A -> B that makes a commuting diagram with f and g, i.e., such
that g = hf.
The way to see that this is what Lang intends is to read carefully the
preceding paragraph, where he states precisely what he means by a
morphism in the category of abelian groups with set-maps of S into
them. Then it is reasonably safe to assume that if he doesn't say
what he means the morphisms to be in this new category, it is because
the situation is completely analogous.
Conceptually, I recommend describing this auxiliary category as having
for objects "abelian groups with homomorphisms of the monoid M into
them". I.e., where Lang calls the homomorphism f: M --> A the
object, I would call the pair (A, f) the object. This makes the
definition of a morphism (A,f) --> (B,g) more intuitively natural --
it is a homomorphism between these groups which respects the "additional
structure" on the groups, namely, the maps of M into them.
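In code, the condition on a morphism (A,f) --> (B,g) is just the
commutativity g = h o f; here is a toy check of mine (the particular
maps are made up, not from Lang):

```python
# A morphism (A, f) -> (B, g) in the auxiliary category is a
# homomorphism h: A -> B making the triangle commute: g = h o f.
# Toy instance: M = (N, +), A = B = (Z, +), f(n) = n, g(n) = 3n.
f = lambda n: n          # homomorphism M -> A
g = lambda n: 3 * n      # homomorphism M -> B
h = lambda a: 3 * a      # homomorphism A -> B
assert all(h(f(n)) == g(n) for n in range(50))   # the triangle commutes
print("h is a morphism (A, f) -> (B, g)")
```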
----------------------------------------------------------------------
Concerning the fact that Lang says near the bottom of p.61 that
in the fiber product shown earlier on that page, one calls p_1 the
"pullback" of g by f, you ask in what way p_1 is related to g.
I suggest you think about the form that the fiber product diagram
takes when C is the category of sets -- this is noted in the comment
on that page in the Companion -- and verify for yourself that in
that case, if g is 1-1 then so is p_1, and that if g is
surjective then so is p_1. These two facts don't themselves prove
that p_1 is naturally associated to g; but I think that the
understanding that proving those two facts will give you should
convince you that this is so.
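If you want to experiment, the set-theoretic fiber product takes only
a few lines (my illustration; the particular f and g are made up):

```python
# Fiber product of sets: A x_C B = {(a, b) : f(a) = g(b)}, with
# p1(a, b) = a the "pullback of g by f".  With g bijective here,
# p1 comes out both one-to-one and onto A.
def fiber_product(A, B, f, g):
    return [(a, b) for a in A for b in B if f(a) == g(b)]

A, B = [0, 1, 2, 3], [0, 1]
f = lambda a: a % 2                  # f: A -> C, with C = {0, 1}
g = lambda b: b                      # g: B -> C, bijective
P = fiber_product(A, B, f, g)
p1 = [a for (a, b) in P]
assert sorted(p1) == A               # g surjective => p1 surjective
assert len(set(p1)) == len(P)        # g injective  => p1 injective
print(P)
```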
----------------------------------------------------------------------
You ask about Lang's use of the word "rule" in defining "functor"
on p.62.
For "rule" read "function". I think he is avoiding the word "function"
because a function is supposed to have a _set_ as domain, and he is
defining a category to be a "collection" which may be too big to form a
set. The discussion in the Companion starting near the bottom of p.31
tells how to get around that dilemma; thus functors can indeed be
considered to consist of functions (one function on objects and one
one morphisms).
----------------------------------------------------------------------
(I guess this question was suggested by the term "natural
transformation" on p.65 of Lang.)
> I'm wondering if the terms "canonical" and "natural" have some
> well-defined meaning in category theory, e.g. a construction is
> canonical if all other similar functors factor through its functor.
The best definition I can give of "canonical" is the one in the
comment in the Companion to p.14 of Lang: "A canonical object means
one determined in a special way by the data being considered." It
would be a pity if someone gave the term a technical meaning, even if
that meaning matched the above sense in a large class of cases, because
we need words for the "meta-discussion" of mathematics, and as more
and more of these words are given technical meanings (e.g., "simple",
"natural" etc.) it becomes harder and harder to talk about mathematical
concepts, as distinct from making formal mathematical statements.
As I said in class, "natural" came to be used to mean "satisfying the
conditions for a morphism of functors", and many people now use the
word "natural transformation" to mean morphism of functors. But for
the reason indicated above, I strongly favor just calling a morphism
of functors a morphism of functors, and I'm glad Lang does so.
----------------------------------------------------------------------
Regarding Lang's introduction to category theory (pp.53-65), you ask
> What is category theory actually useful for, beyond adding notation
> which is uniform across mathematics? ... in section 11, Lang doesn't
> prove any big theorems. Are they simply coming later? ...
We have seen one general result: Though Lang gave the statement that
"universal repelling" and "universal attracting" objects are unique up
to unique isomorphism in passing, it is actually a powerful tool --
easy to see in the abstract, but not necessarily in a specific
situation where the conditions for which an object is universal are
complicated. And I mentioned today that the construction of free
groups (and the same applies to coproducts) as subobjects of big direct
products worked in a very general category-theoretic context.
The choice Lang makes is to introduce just some basic concepts of
category-theory, and present them as a language in which to formulate
some specific results about groups. Getting into the technicalities
needed to present general category-theoretic results would take us too
far afield from the main topics of his text, and Math 250A. In my 245
notes, I make a somewhat different choice: I develop a large number
of examples of universal constructions in algebra emphasizing the
parallelisms but not introducing the concept of category, and then,
with all these on hand for motivation, I define category, and develop
the theory that unifies these results and provides common proofs for
the results shown separately.
Some of my exercises in the Companion for section I.11 of Lang
do obtain general category-theoretic results, though not real biggies:
I.11:1(d), I.11:4(b), and I.11:5.
----------------------------------------------------------------------
You ask whether the description of the free abelian group on M as
a group of maps from M to the integers has an analog for the free
group on M (p.66).
Not an obvious one. In the case of the free abelian group, we were
able to associate to different elements of M functions with
disjoint supports (where the "support" of a function means the set
of points where it is nonzero). Group-valued functions with disjoint
supports clearly commute with each other (because evaluating two such
functions at any point, one of them is 1, and 1 commutes with any
other element); hence using such functions, we cannot generate any
noncommutative group. Thus any noncommutative analog of the description
of the free abelian group cannot have the property that the function
associated to each x\in M has support {x}.
If we abandon that assumption, then it is no longer natural to identify
the domains of our functions with M; so we should look for a
description of a free group on M as a group of functions on _some_ set
X, with values in some not-necessarily-commutative groups. This is
essentially Lang's proof of Lemma I.12.2, the set being {(i,phi)}
and the groups being G_{i,phi}. But in that form, it doesn't look much
like the description of the free abelian group.
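For concreteness, the description of the free abelian group referred
to above can be sketched as follows (my own illustration):

```python
# The free abelian group on M as the group of finitely supported
# Z-valued functions on M (represented as dicts, dropping zero values);
# the generator attached to x in M is the function with support {x}.
def gen(x):
    return {x: 1}

def add(f, g):
    s = dict(f)
    for x, n in g.items():
        s[x] = s.get(x, 0) + n
    return {x: n for x, n in s.items() if n != 0}

def neg(f):
    return {x: -n for x, n in f.items()}

a, b = gen("a"), gen("b")
assert add(a, b) == add(b, a) == {"a": 1, "b": 1}   # disjoint supports commute
assert add(a, neg(a)) == {}                          # the empty dict is 0
print("ok")
```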
----------------------------------------------------------------------
Regarding the construction in the first display on p.67, you ask
why we can't we let the index set I be "the set of isomorphism classes
of all groups generated by S", and each G_i a representative of the
class i.
The problem is that these isomorphism classes don't form a set --
each one of them fails to be a set, because elements of a group can
be taken to be arbitrary objects in our set theory; so that if we
had a set of all groups of a given order, then the set of their
identity elements (for instance) would be the set of all sets, which
we know leads to paradoxes. Of course, ZFC doesn't say "you aren't
allowed to perform a construction if one can show that it would lead
to a paradox"; it simply avoids paradoxes by not giving us the tools to
create a "set of all" anythings, except for things built from specific
previous sets. So in this case, the equivalence classes you want are
not sets, and we can't talk about them as such. However, groups whose
underlying set is contained in T do form a set, and we can construct
them all.
If we took one member of each isomorphism class within the set of
groups with underlying set contained in T, given with maps of S
into them which generated them, we would have what you wanted: a set
of groups with maps of S into them such that every group generated
by the image of a map of S into it was isomorphic to one and only
one of them. But since to prove that this exists, we have to start with
the whole set of groups with underlying set contained in T, it would
just add to the length of our proof to pare this set down by throwing
away all but one member of each isomorphism class.
----------------------------------------------------------------------
You ask how to prove Lang's assertion that the group presented as in
the second display on p.69,
   <x, y, z ; xyx^{-1}y^{-1} = y, yzy^{-1}z^{-1} = z, zxz^{-1}x^{-1} = x>,
is trivial.
Let's re-write the three relations defining this group in several ways:
(1a) xyx^{-1}y^{-1} = y   (1b) xy = y^2 x   (1c) xyx^{-1} = y^2
     (1d) yx^{-1} = x^{-1}y^2
(2a) yzy^{-1}z^{-1} = z   (2b) yz = z^2 y
(3a) zxz^{-1}x^{-1} = x   (3b) zx = x^2 z   (3c) xzx^{-1} = x^{-1}z
     (3d) zx^{-1} = x^{-2}z
Here (3c) is gotten by applying x^{-1}(...)x^{-1} to both sides
of (3b), and interchanging the two sides. The point of (1c) and (3c)
is to express the relations in question as describing the action
of conjugation by x on the two other generators; the point of (1d)
and (3d) is to express these same relations as rules for moving "x^{-1}"
to the left past y or z, modifying the latter appropriately.
If we now conjugate (2b) by x, then by (1c) and (3c) the result is
(4) (y^2)(x^{-1}z) = (x^{-1}z)^2 y^2.
We now apply (1d) and (3d) to bring all occurrences of x^{-1} to
the left on each side of (4), getting
(5) x^{-1} y^4 z = x^{-3} z^2 y^2.
Multiplying (5) on the left by x^3 and applying (2b) (in reverse:
z^2 y = yz) to the right-hand side, we get x^2 y^4 z = yzy;
multiplying this on the right by y^{-1} (using zy^{-1} = y^{-1}z^2,
which also follows from (2b)) and then by z^{-1}, we get
(6) x^2 y^3 z = y, i.e., z = y^{-3} x^{-2} y.
Substituting this expression for z into (2b), multiplying both sides
on the left by y^3, and using (1d) and its inverse form
y^{-1}x^{-1} = x^{-1}y^{-2} to bring all powers of x to the left,
we get x^{-2} y^5 = x^{-4} y^{-6}, i.e.,
(7) x^2 y^{11} = e.
Now we can conjugate (7) by x using (1c), to get
(8) x^2 y^{22} = e.
Comparing (7) and (8) we get y^{11} = e, hence (7) shows x^2 = e,
hence (3b) gives zx = z, or
(9) x = e.
One can now get y = e and z = e either by calling on the symmetry
of the original system of equations, or by substituting (9) into (1a),
and substituting the resulting equation y=e into (2a).
- - -
Incidentally, it is known that if one goes one step further, the
presentation
G = <x, y, z, w ; xyx^{-1}y^{-1} = y, yzy^{-1}z^{-1} = z, zwz^{-1}w^{-1} = w, wxw^{-1}x^{-1} = x>
gives a very nontrivial group. The idea is to first look at the groups
G_1 = <x, y, z ; xyx^{-1}y^{-1} = y, yzy^{-1}z^{-1} = z>,
G_2 = <z, w, x ; zwz^{-1}w^{-1} = w, wxw^{-1}x^{-1} = x>,
and show that in each of them, the subgroup generated by x and z
is free on those two generators. The group G is then what Lang would
call the "fibered coproduct of G_1 and G_2 over the free group
on {x,z}", which group theorists would call the "free product of G_1
and G_2 with amalgamation of" that common subgroup, and still others
would call the "pushout of the diagram formed by" those three groups.
Anyway, the general structure theorem for "free products with
amalgamation" shows that G_1 and G_2 are both embedded in G. On
the other hand, any homomorphic image of G in which one of x,y,z,w
has finite order is the trivial group: Given, say, an equation x^n=e,
one deduces successively equations of that form for y, z, and w with
different exponents, and finally, another such equation for x with a
different exponent, which together with the given equation implies x=e.
(By the same principle, as soon as I got (9) above, I knew I was "home
free".) So this is an example of a finitely generated nontrivial group
with no finite nontrivial homomorphic images.
----------------------------------------------------------------------
You ask what Lang means on p.70, two lines above first display, by
"the free group with generators u(b), w and relations SL1 - SL4".
He means the group presented by those generators and relations. Since
the idea of "free" is "not satisfying any relations other than those
that have to be satisfied", it is sometimes colloquially used to
describe any universal algebraic object. However, since it has been
given the specific meaning "not satisfying any relations other than
those implied by the identities", and one has the distinct term
"presented by the indicated generators and relations" for what Lang
means here, he ought to use that term.
----------------------------------------------------------------------
You ask, regarding Proposition I.12.3, p.70, whether there is a general
schema to show that coproducts exist in a given category.
Well, categories are not all alike, and coproducts don't exist in all
of them! But in a large class of "naturally occurring" categories,
exactly the method Lang uses here allows one to prove that any
family (X_i)_{i\in I} of objects has a coproduct: One finds a
set of pairs (object, family of maps of the X_i into it) which
is general enough to approximate "all" such pairs; takes the
direct product of these objects, and uses "the subobject generated
by the images of the given maps". Of course, a general category
doesn't have such concepts as "subobject", but in the naturally
occurring categories for which this method works (groups, monoids,
rings, etc.) one does.
On the other hand, the description of the _structure_ of the coproduct
of a family in Proposition I.12.4-5 is special to groups.
----------------------------------------------------------------------
You ask about the cardinality estimate in the middle of p.71;
specifically, why a countable union of sets each having the same
cardinality as S also has the same cardinality as S.
By Theorem A2.3.3, p.887. (Taking the direct product with D is
equivalent to taking the union of countably many copies.)
----------------------------------------------------------------------
You ask about the statement on p.71, below the middle display, "we
may assume without loss of generality that G = S_gamma for some
gamma, and that g = psi for some psi \in Phi_gamma".
Whenever an author says "We can assume without loss of generality
that X is the case", this means "If we know the result true when
X is the case, then we can prove the general case from this." So
in the present situation, assume provisionally that whenever
G = S_gamma and g = psi \in Phi_gamma, there exists g* as in
the preceding display. To prove the general case from this, remember
that Gamma and the sets Phi_gamma were constructed to give,
up to isomorphism, "all" groups of order card(S) and all families
of homomorphisms of the G_i into them. So given an arbitrary G with
card(G) = card(S), we can find an isomorphism alpha from
G to some S_gamma, and by composing alpha with the family
of maps g, we must get some phi\in Phi_gamma. Now the assumed
result for S_gamma and phi gives us a map g_*: F_0 --> S_gamma
with the indicated property; composing this with alpha^-1, we get the
desired map F_0 --> G.
----------------------------------------------------------------------
> On page 71, Lang mentions in the example that G_2\coprod G_3 is the
> group generated by two elements S,T with relations S^2=1,(ST)^3=1.
> How does one show this?
I guess by using the "several properties of S, T" proved in the book
quoted on p.72 (sentence after second display). I'll add to future
versions of the Companion a note to skip this example (or at least
to realize that one does not have enough information to prove the
assertions).
A method of proving that certain elements of SL(2,Z) generate free
subgroups is indicated in Exercise 2.4:5 (p.34) of my Math 245 notes.
But it doesn't go as far as getting the structure of the whole group.
----------------------------------------------------------------------
Regarding the assumption that rings must have unit (p.83), and my
comment in the Companion (p.43) that there is a trick which reduces
the study of rings without unit to that of rings with unit
> I'm curious about the trick for studying rings without units.
> ... The reason I ask is that in manifold theory, the ring of
> C^\infty functions with compact support ... has no unit ...
> ... Is the idea somehow to just think of rings without units
> as ideals of larger rings with unit?
In a way, yes. Given any nonunital ring A, one can make the
abelian group A' = Z (+) A into a ring, by taking the element 1\in Z
to behave as the unit, using the given multiplication on A to
define the product of two elements of A, and using distributivity
and these two facts to get a multiplication on all of A', namely
(m+a)(n+b) = mn + (na + mb + ab). In this situation, we see that
A will form an ideal of A'. If one's rings are algebras over a field,
as in the situation you referred to, it is more natural to use that
field instead of Z. So if A is the ring of C^\infty functions with
compact support, then A' will be the ring of C^\infty functions which
are constant off a compact set. (This needs a bit of qualification in
the case where the manifold is itself compact.)
For an example of the nice properties of this construction, note that
if a is an element of a nonunital ring A, the left ideal of A
generated by a does not have the form Aa, but Za + Aa. But if we
regard A as lying within A', this is simply A' a.
The ring A' constructed above has a natural homomorphism to Z,
given by m + a |-> m. In fact, the category of unital rings
"over Z" in the sense of Lang, p.61, is essentially the same as
the category of nonunital rings; and that is what I feel justifies
regarding the study of nonunital rings as subsumed by that of unital
rings. (Again, the same applies to algebras, mutatis mutandis.)
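The construction is easy to experiment with. Here is a minimal Python
sketch (my own illustration; the choice of A = 2Z, the even integers
under the usual operations, as a sample nonunital ring is mine). It
also checks that the projection m + a |-> m respects multiplication.

```python
# Sketch of the unitalization A' = Z (+) A, with multiplication
# (m+a)(n+b) = mn + (na + mb + ab).  Elements of A' are pairs (m, a);
# here A = 2Z, the even integers, serves as a sample nonunital ring.

def add(x, y):
    (m, a), (n, b) = x, y
    return (m + n, a + b)

def mul(x, y):
    (m, a), (n, b) = x, y
    return (m * n, n * a + m * b + a * b)

one = (1, 0)                 # the adjoined unit 1
x = (0, 4)                   # the element 4 of A, sitting inside A'
y = (3, 2)                   # 3 + 2, an element of A' not in A

assert mul(one, y) == y and mul(y, one) == y    # 1 is a two-sided unit
assert mul(x, y)[0] == 0 and mul(y, x)[0] == 0  # A is an ideal of A'
assert mul(y, (2, 6))[0] == 3 * 2               # m + a |-> m is multiplicative
```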
----------------------------------------------------------------------
In connection with the definition of ring (p.83) you asked whether
there were cases where one might want to consider "rings with a
non-commutative addition operation".
If one left commutativity of addition out of the definition, it would
still follow from the distributive laws. Namely, for any two elements
a and b we can simplify (a+b)(1+1) in two ways: by using
left distributivity and then right distributivity, or the other way
around. Equating the results gives a+b+a+b = a+a+b+b. Cancelling
the common term a on the left and the common term b on the right,
we get b+a = a+b.
There are variants of the concept of ring where this argument won't
prove complete commutativity. E.g., if the ring doesn't have unit,
a similar computation will prove that any two elements that are
products commute with each other under addition, but not arbitrary
elements. And if one only assumes distributivity on one side, one can
have more widespread noncommutativity. There are concepts such as
"near-ring" and "half-ring" embodying such weakened assumptions; they
are studied by a small number of people, of which I am not one.
----------------------------------------------------------------------
Regarding the definition of principal ideal ring, p.86,
> Are there examples of principal ideal rings which are not
> principal ideal domains?
Yes. It is easy to check that any homomorphic image of a principal
ideal ring is a principal ideal ring. Hence for any n, Z/nZ, being
a homomorphic image of Z, is a principal ideal ring; but for
n > 0 not prime, it will not be an integral domain.
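One can check this by brute force for a particular n. The following
Python sketch (my own illustration, with n = 12) enumerates every
subset of Z/12Z closed under addition and under multiplication by
ring elements, and confirms that each one is of the form (d).

```python
# Brute-force check (for n = 12; an illustration, not a proof): every
# subset of Z/12Z closed under addition and under multiplication by
# ring elements is of the form (d) for some d.

from itertools import combinations

n = 12
R = range(n)

def is_ideal(S):
    return (0 in S
            and all((a + b) % n in S for a in S for b in S)
            and all((r * a) % n in S for r in R for a in S))

ideals = [frozenset(S) for k in range(n + 1)
          for S in combinations(R, k) if is_ideal(set(S))]

for I in ideals:
    assert any(I == {(r * d) % n for r in R} for d in R)  # I = (d)

assert len(ideals) == 6   # one ideal (d) for each divisor d of 12
```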
----------------------------------------------------------------------
On p.87,
> Lang says "If a, b are left ideals of A, then a+b (the sum being taken
> as additive subgroup of A) is obviously a left ideal." What does he
> mean by "the sum being taken as additive subgroup of A" ?
He means {r+s | r\in a, s\in b}. Cf. last paragraph on p.37.
Trying to answer this question sent me looking to see where Lang first
introduces that notation. He introduces it in _multiplicative_ form
on p.6, in the paragraph before the middle of the page, and uses it
commonly in sections I.2-I.3. As far as I can see, he first uses it
in additive form on p.37, at the point mentioned above, taking for
granted the transition from multiplicative to additive notation.
----------------------------------------------------------------------
You asked whether homomorphisms whose kernels are prime ideals (p.92)
in some way "generate" all homomorphisms.
Basically, the answer is "no", but there are certain positive statements
one can make. As you know, the monoid of all ideals of the ring of
integers is generated under multiplication by the set of prime ideals.
More generally, this is true in all principal ideal domains. The two
questions one can ask are "Does this fact lead to corresponding
statements about _homomorphisms_ having these ideals as kernels?" and
"Does this fact generalize to (some or all) rings that are not principal
ideal domains?" To the former question, the answer is "Yes in the
context of module homomorphisms, though not in the context of ring
homomorphisms"; to the latter, the answer is "Yes, with a weakened
conclusion, for certain classes of rings." If you want more details,
you can ask me in office hours.
----------------------------------------------------------------------
You ask why the first line on p.93 shows that y\in \frak m.
(I'm using "\frak" as in TeX notation for "fraktur font".)
Interestingly, this is the same point that stumps many 113 students
in the proof of Euclid's Lemma, which is really a special case of
this result.
To see why the right-hand side is in \frak m, look at the preceding
steps of this proof and see what elements we already know to be
in \frak m, with special attention to terms similar to those in
this expression. By assumption xy\in \frak m, and behold, there is
an xy dividing one summand in the right-hand side of the line in
question; so you just have to look at the other summand, yu. And
the preceding line of the proof says "u \in \frak m", so that's in
\frak m too. So the sum is in \frak m.
----------------------------------------------------------------------
You ask about the statement on p.94 that "given an integer n > 1, the
units in the ring Z/nZ consist of those residue classes mod nZ which
are represented by integers m \neq 0 and prime to n."
One way to think of this is that for such an m, multiplication by
m doesn't take any number not divisible by n to any number that
is divisible by n. This is clear if we know unique factorization
of integers; and it follows that multiplication by m will be 1-1
on Z/nZ; since that set is finite, it will then be invertible,
so the residue class of m is a unit. If we don't assume familiarity
with the unique factorization property of the integers (which will
be deduced in a later section from general results about certain kinds
of rings), we can, as you say, use the Euclidean algorithm to get
um + vn = 1, so um is congruent to 1 mod n, so the residue
of m is a unit.
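The Euclidean-algorithm step can be made concrete. Here is a minimal
Python sketch (the function names are my own) computing u with
um + vn = 1, and hence the inverse of m mod n.

```python
# Extended Euclidean algorithm: when gcd(m, n) = 1 it produces u, v
# with u*m + v*n = 1, so [u] is the inverse of [m] in Z/nZ.

def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)

def inverse_mod(m, n):
    g, u, _ = ext_gcd(m, n)
    if g != 1:
        raise ValueError("m is not a unit mod n")
    return u % n

assert (7 * inverse_mod(7, 12)) % 12 == 1
```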
----------------------------------------------------------------------
You ask about the origin of the term "Chinese Remainder Theorem" (p.94).
That result for the ring of integers was known to Chinese mathematicians
several centuries back. My understanding is that they were concerned
with chronological cycles, and whether various combinations of
positions in such cycles would occur. In particular, Chinese culture
has a 10-year cycle and a 12-year cycle; each point in each cycle has
a name; the points in the 12-year cycle are associated with animals
and give the "year of the dog" etc. that we hear about every Chinese
New Year. I haven't heard of the items in the 10-year cycle having
any such significance, but it is equally important in specifying years.
Since 10 and 12 are not relatively prime, the version of the Chinese
Remainder Theorem in Lang is not applicable to that case; but for any
two ideals I and J of a ring R and elements x\in R/I, y\in R/J,
one can show that there is an element of R belonging to the
intersection of x and y if and only if x and y have the same
images in R/(I+J). Thus, a member of Z/10Z and a member of Z/12Z
can be realized by a common element of Z if and only if they have
the same image in Z/2Z. Hence a point in the 10-year cycle and a
point in the 12-year cycle will be reached together at some time if
and only if either they both have odd positions in their cycles, or
both have even positions.
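The criterion for the 10- and 12-year cycles is easy to verify by
brute force; the following Python sketch (my own) checks that residues
x mod 10 and y mod 12 come from a common integer exactly when x and y
agree mod 2.

```python
# Brute-force check of the criterion: x mod 10 and y mod 12 are the
# images of a common integer iff x = y mod 2, since (10) + (12) = (2).

from math import gcd

def compatible(x, y, m=10, n=12):
    # a simultaneous solution, if any, occurs below lcm(m, n)
    lcm = m * n // gcd(m, n)
    return any(k % m == x and k % n == y for k in range(lcm))

for x in range(10):
    for y in range(12):
        assert compatible(x, y) == (x % 2 == y % 2)
```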
Probably the background of the theorem includes more complicated
cycles as well. For instance, since they have a lunar month, they
have to alternate between 12-month years and 13-month years, and the
cycle of these takes 19 years. (The same applies to the traditional
Jewish calendar.) But I don't know more details.
----------------------------------------------------------------------
You ask about Lang's statement on p.97, beginning of section II.3,
that "there are polynomials over a finite field which cannot be
identified with polynomial functions in that field."
If p is any prime, then over the field k = Z/pZ (the field of p
elements), the polynomial f(x) = x(x-1)(x-2)...(x-(p-1)) has the
property that for all a \in k (i.e., for each of the values a = 0,
1, 2, ..., p-1), f(a) = 0. So the function gotten by evaluating
f at elements of k is the zero function, although f and 0 are
different polynomials.
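For concreteness, here is a small check of this for p = 5 (my own
sketch): f evaluates to 0 everywhere on Z/5Z, yet its coefficient list
is nonzero.

```python
# Check for p = 5: f(X) = X(X-1)(X-2)(X-3)(X-4) is the zero *function*
# on Z/5Z, but not the zero *polynomial*.

p = 5

def f(a):
    prod = 1
    for j in range(p):
        prod = prod * (a - j) % p
    return prod

assert all(f(a) == 0 for a in range(p))    # zero as a function

# expand f as a polynomial mod p (coefficients from degree 0 upward)
coeffs = [1]
for j in range(p):
    new = [0] * (len(coeffs) + 1)
    for i, c in enumerate(coeffs):
        new[i + 1] = (new[i + 1] + c) % p  # c*X^i times X
        new[i] = (new[i] - j * c) % p      # c*X^i times (-j)
    coeffs = new

assert coeffs == [0, 4, 0, 0, 0, 1]        # i.e. X^5 - X mod 5, nonzero
```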
----------------------------------------------------------------------
You ask whether left adjoints (Companion, p.52, one of the items
to go after end of section on p.107 of Lang) are unique.
Indeed they are -- up to isomorphism, of course. Lang observed that
"universally repelling objects" are unique up to unique isomorphism,
and that free groups etc. could be considered universally repelling
objects in certain auxiliary categories. The same applies to the
objects F(X) which fit together to form the left adjoint of any
functor U that has one -- each of them corresponds to an initial
(i.e., universally repelling) object in an appropriate auxiliary
category, and so is unique up to a canonical isomorphism; and since the
morphisms that join the objects F(X) into a functor F are also
determined uniquely (by instances of the universal properties of the
separate objects), the functor F (if it exists) is unique up to a
canonical isomorphism.
----------------------------------------------------------------------
You ask why mathematicians consider localization of commutative
rings (Lang, pp.107-111). It's hard to know how to answer such a
question; for me the first answer is "Because it is interesting". But
I'll give you a couple more examples of how localizations arise, that
might make sense to you.
First, consider a polynomial ring K[X] (K a field) as a subring of the
ring K[[X]] of formal power series a_0 X^0 + ... + a_n X^n + ...
(not defined or studied in 250A). Many polynomials that are not
invertible in K[X] become invertible in K[[X]]; e.g., 1-X has
the inverse 1+X+X^2+...+X^n+... . So one can consider within
K[[X]] the subring generated by the elements f(X) g(X)^{-1} such
that f(X) is a polynomial, and g(X) is a polynomial that is
invertible as a formal power series. This will be the localization
of K[X] at the set of polynomials of the latter sort, namely, the
polynomials that do not belong to the ideal (X); i.e., it will
be K[X]_{(X)}.
Second, consider a ring Z/pZ (p a prime). Not only can every
integer a be mapped to an element [a] of this ring; given any
fraction a/b such that b is not divisible by p, we can map
a/b to [a][b]^{-1}. (E.g., for p = 5, we can map 2/3 to
[2] . [3]^{-1} = [2] . [2] = [4].) The set of rational numbers
for which we can do this forms the localization of Z at (p),
written Z_{(p)}, and the standard homomorphism Z --> Z/pZ extends
to a homomorphism Z_{(p)} --> Z/pZ.
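The second example can be computed directly. A small Python sketch
(my own; it uses pow(b, -1, p), available in Python 3.8+) sends a/b
to [a][b]^{-1} in Z/pZ for p = 5.

```python
# Sketch for p = 5: a rational a/b with 5 not dividing b maps to
# [a][b]^{-1} in Z/5Z.  (pow(b, -1, p) requires Python 3.8+.)

p = 5

def localize_image(a, b):
    """Image in Z/pZ of a/b; defined only when a/b lies in Z_(p)."""
    if b % p == 0:
        raise ValueError("a/b is not in Z_(p): p divides the denominator")
    return a * pow(b, -1, p) % p

assert localize_image(2, 3) == 4   # the example in the text: 2/3 |-> [4]
assert localize_image(1, 2) == 3   # [2]^{-1} = [3] since 2*3 = 6 = 1 mod 5
```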
----------------------------------------------------------------------
Regarding the verification of the statement in Lang that the
map h of the next-to-last display on p.109 is a homomorphism,
you write "I don't see how I can justify commuting f(s)^(-1) with
f(a')f(s')^(-1)".
In this section, "ring" means "commutative ring" unless the contrary
is stated; so in defining the category C at the top of the page,
B is assumed commutative!
However, though I don't know whether Lang thought about this, everything
he says on this page remains true if the objects B, B' etc. are
allowed to be noncommutative rings, as long as A remains commutative.
The key fact is:
| Lemma. If x and y are elements of a monoid M, and commute with
| each other, and if y is invertible in M, then x also commutes
| with y^{-1}.
Proof. x y^{-1} = (y^{-1} y) x y^{-1} = y^{-1} (y x) y^{-1} =
y^{-1} (x y) y^{-1} = y^{-1} x (y y^{-1}) = y^{-1} x. []
Note that in the above situation, if x is also invertible, then
x^{-1} will similarly commute with y, and by a second application
of the same principle, it will commute with y^{-1}. So the step
you ask about can indeed be justified without assuming B commutative.
----------------------------------------------------------------------
You ask whether, on the second line of p.56 of the Companion (material
on p.109 of Lang), "h(s^{-1})" should be "h(s^{-1} a)".
Right. Thanks!
----------------------------------------------------------------------
Regarding the construction of localizing a commutative ring at a
prime ideal (Lang, p.110) you ask
> Can all local rings be thought of as arising from localizing using
> the complement of a prime ideal?
Well, if you take any local ring R, and localize it using the
complement of its own maximal ideal, the result will again be R
(since the complement of the maximal ideal of R is exactly the
set of invertible elements). So every local ring arises in that
way by localizing itself.
Perhaps what you really meant was "Does every local ring that
comes up naturally in algebra arise by localizing some naturally
arising non-local ring using the complement of a prime ideal?"
The answer is no. An example is the ring K[[X]] of formal power
series a_0 X^0 + ... + a_n X^n + ... over a field K. This is
not defined or studied in 250A; but see Lang, section IV.9.
Another sort of example that doesn't arise by localization is the
field Z/pZ, or the ring Z/p^n Z (p a prime, n a positive
integer).
----------------------------------------------------------------------
Concerning the discussion of specializations of fields in the Companion
(p.57, material concerning p.111 of Lang), you ask what I mean by
"minimal domain" in the first sentence of the second paragraph.
As noted in the first paragraph, a specialization \phi on a field E
is a map which is not defined on all of E -- its domain is a certain
subring E_\phi of E. So one can make the set of specializations
from E to K a partially ordered set by writing \phi\leq\psi
if E_\phi\subseteq E_\psi, and \phi is the restriction of
\psi to E_\phi. Looking at all specializations E --> K whose
domains contain a certain set, one can look at minimal members of this
partially ordered set under the above ordering.
----------------------------------------------------------------------
> Lang defines left module on page 117. Shouldn't he include the
> condition (ab)x=a(bx) as well?
When he speaks of "an operation of A on M (viewing A as a
multiplicative monoid)", this is understood to entail that equation;
see p.26, line 5. (On p.26, G is restricted to be a group, so Lang
is sloppy in taking for granted without having said it that the same
conditions define operations of general monoids.)
----------------------------------------------------------------------
Many of your questions are based on the assumption that the word
"algebra" does not presume associativity or the existence of unit.
Lang's discussion of the topic on p.121 gives that impression, but
as I say in the first sentence of p.64 of the Companion, "Lang's
introduction to this concept is misleading"; and I make clear that the
use of the word "algebra" does not imply absence of the associativity
and unitality conditions. (One can consider, as we do in this course,
associative unital rings; in other contexts one considers nonassociative
and/or nonunital rings. The same applies to algebras.) And Lang
himself, in the last sentence of the middle paragraph of p.121, cancels
the impression given by what he says up to that point by writing "But
in this book, unless otherwise specified, we shall assume that our
algebras are associative and have a unit element."
So -- if you want to know the answers to some questions about
nonassociative and/or nonunital rings or algebras, you can ask them;
but don't assume that that is what Lang or I mean in what we write
about algebras.
----------------------------------------------------------------------
Regarding a comment in the Companion about p.129 of Lang, you ask
> What is an overring?
If R is a subring of S, then S is called an overring of R.
----------------------------------------------------------------------
You ask whether in Corollary III.4.3, p.135, the modules are over the
same ring.
Yes. Isomorphism of modules is only defined when they are over the
same ring, so unless some wording is added to imply a nonstandard sense
of "isomorphism", one can presume when "isomorphism" is mentioned that
the modules are over the same ring. (Also, when modules or vector
spaces are mentioned without specification of the ring or field, one
can generally assume that there is some fixed ring or field in the
background, which all modules or vector spaces mentioned are over.)
----------------------------------------------------------------------
You ask about my notes in the first long paragraph of p.72 of the
Companion (in the material on Lang's p.138), about getting K_0.
It follows from the characterization of K_0 in the last paragraph
on p.138 that K_0(R) can be gotten from K(R) by dividing out by
the subgroup generated by the elements representing the free modules.
(It takes some thought to verify this formally, but it is not hard.)
Is it clear now why the statements in the Companion follow?
----------------------------------------------------------------------
You ask about the statement on p.73, of the Companion, lines 9-10, in
the notes to follow the section ending on p.139 of Lang, that "all but
one condition" for preservation of short exact sequences had been
proved in section III.2.
The results I meant were Propositions III.2.1 and III.2.2, pp.131-132.
For instance, in the case of Hom(X,-), if we restrict the result of
Proposition III.2.2 to short exact sequences, i.e., put a "->0" at the
right end of the given exact sequence, then the proposition as it
stands shows that the sequence of hom-sets shown at the bottom of the
page is exact -- but not that it would remain exact if "->0" were put
at the end. And indeed, that is not true for general modules X. But
that exactness condition is exactly the content of projectivity of X.
Similarly, if Prop.III.2.1 is applied to short exact sequences, it
fails to show one of the conditions that would be needed to preserve
exactness, but that one condition is the content of the statement "Y is
injective".
----------------------------------------------------------------------
Regarding the existence of bases for all vector spaces (p.139,
Theorem III.5.1) you ask
> ... Has anyone ever described an uncountable basis for a vector
> space? ...
Sure! Consider the space of real-valued functions on the real line with
finite support (i.e., equal to zero except at finitely many points).
It has a basis consisting of the elements that have value 1 at a single
point, and 0 everywhere else.
For a slightly less obvious example, consider the space of step
functions on [0,1]; i.e., functions f such that you can divide
[0,1] into finitely many intervals (let's say right-half-open, for
concreteness, i.e., of the form [a,b) with 0\leq a < b < 1, or of
the form [a,1]) so that f is constant on each of these intervals.
A basis for this space is given by the functions that are 0 on an
interval [0,a) and 1 on the complementary interval [a,1]
(a\in [0,1)).
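One can check this basis claim computationally: the coefficient of the
threshold function e_a (0 on [0,a), 1 on [a,1]) in the expansion of a
step function f is just the jump of f at a. Here is a small Python
sketch (the representation of step functions as (left endpoint, value)
pairs is my own).

```python
# Decomposing a step function on [0,1] in the threshold basis
# e_a (0 on [0,a), 1 on [a,1]): the coefficient of e_a is the jump
# of f at a.  A step function is given as a list of
# (left_endpoint, constant value) pairs for its intervals.

def threshold_coeffs(steps):
    """Return (a, coefficient) pairs expressing f = sum coeff * e_a."""
    coeffs = []
    prev = 0
    for a, value in steps:
        if value != prev:
            coeffs.append((a, value - prev))
        prev = value
    return coeffs

def evaluate(coeffs, t):
    # sum of the threshold functions that are 1 at t
    return sum(c for a, c in coeffs if t >= a)

f = [(0, 2), (0.25, 5), (0.5, 5), (0.75, 1)]   # constant pieces of f
coeffs = threshold_coeffs(f)
for a, value in f:
    assert evaluate(coeffs, a) == value
```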
Still more challenging -- but more natural-seeming -- is the space of
piecewise linear continuous functions on R or on [0,1]: functions
that are of the form f(x) = ax + b on each of finitely many intervals
into which one divides the domain, with values agreeing where two
intervals meet (e.g., the function |x|). You might see whether you
can find a basis for that.
But the really natural cases -- trying to find a basis for R as
a Q-vector-space, or for K^N as a K-vector-space where K is a
field and N the set of natural numbers -- are probably impossible
to do without using the Axiom of Choice, in the sense that it can
probably be proved that the non-existence of such bases is consistent
with Zermelo-Fraenkel set theory without that Axiom. A logician should
know for sure whether this is so.
> ... Or even an uncountable linearly independent subset of a vector
> space?
This one can do even for the examples just mentioned. To construct
such an example in K^N, let F be the set of all finite subsets of
N, and let f: N --> F be a bijection, say the map sending each
nonnegative integer n to the set of i such that the binary
expression for n has a 1 in the "2^i's column". Let P(N) be the
set of _all_ subsets of N. We define a map x: P(N) -> K^N as
follows: Given S\in P(N), for each i\in N we let x(S) have value
1 at i if f(i)\subset S, and 0 otherwise.
To show that the elements x(S) (S\in P(N)) are linearly independent,
it will suffice to show that for any finite nonempty family of them,
x(S_1), ..., x(S_n) with S_1, ..., S_n distinct, there exist
m\in{1,...,n} and i\in N such that x(S_m) has value 1 at i,
but all the other x(S_r) have value 0 there. But this is easy
to show: Some S_m will not be contained in any of the others,
and from this we can easily find a finite set T which is contained
in S_m but not in any of the others. If we take i such that
f(i) = T, we have the desired property.
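Here is a finite-window illustration of that argument in Python
(entirely my own sketch; the 0/1 entries stand for 0 and 1 in K): for
a few distinct sample sets we exhibit, for each one, a coordinate at
which it alone has value 1.

```python
# Finite-window sketch of the linear-independence argument: f(n) is the
# set of binary-digit positions of n, and x(S) has value 1 at i exactly
# when f(i) is contained in S.

def f(n):
    return {i for i in range(n.bit_length()) if (n >> i) & 1}

def x(S, width):
    return [1 if f(i) <= S else 0 for i in range(width)]

samples = [{0, 1}, {1, 2}, {0, 2, 3}]   # distinct sample sets S_1, S_2, S_3
width = 16                              # enough coordinates for these samples
vectors = [x(S, width) for S in samples]

for m in range(len(samples)):
    # a coordinate where x(S_m) is 1 and every other x(S_r) is 0 kills
    # the coefficient of x(S_m) in any vanishing linear combination
    witnesses = [i for i in range(width)
                 if vectors[m][i] == 1
                 and all(vectors[r][i] == 0
                         for r in range(len(samples)) if r != m)]
    assert witnesses
```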
And here's a different sort of construction which I am fairly confident
gives an uncountable set of real numbers linearly independent over Q,
though I haven't thought out a detailed proof. For every real number
p \geq 2, let f(p) be the real number which has decimal digit 1 in
positions [p], [p^2], [p^3], ... , and 0 in all other positions,
where for any real number x, the symbol [x] denotes the greatest
integer \leq x. The set of these f(p) should have the desired
properties.
----------------------------------------------------------------------
Regarding the discussion of "the functorial approach", pp.75-76
in the Companion (among the comments to be read the end of III.5,
p.142) you ask about the reference to "Lemma III.5.7(a)". I meant
Lemma III.5.c3(a) -- thanks for pointing this out!
----------------------------------------------------------------------
> On p.143 of Lang, after second display:
> "with b_i\in K" should be "with b_i\in A"
Thanks!
----------------------------------------------------------------------
You ask what Lang means by the equality = at the top
of p.145.
He is considering two bilinear maps: the given map V x V' --> K
(p.144, middle), and the canonical map V x V^v --> K defined by
evaluating members of V^v on members of V. He is denoting both
of them by "< - , - >". So in the equation you ask about, the
former map is meant on the left-hand side, and the latter on the right.
----------------------------------------------------------------------
You ask whether, for either of the sorts of duals considered in Lang
(p.145), the dual is isomorphic to the triple dual.
In general, no. To see this, consider (1) abelian groups of exponent
p, for p some prime, and (2) vector spaces over the field of p
elements. These are essentially the same, and the two duals are the
same in this case, so we only need to examine the situation once.
Further, an infinite-dimensional vector space over a finite field k
has the same cardinality as dimension. Now if E is an infinite
dimensional vector space with basis B, we see that to every subset S
of B we can associate an element of the dual Hom(E,k) which takes
members of S to 1\in k and other members of B to 0. The set of
such elements will have cardinality greater than that of B; so in
the infinite-dimensional case, the dimension of the dual is strictly
greater than that of the original space, and of course this continues
to hold for higher duals. So the dimension of the triple dual is
higher than that of the dual.
This shows that one does not have isomorphism in general; but over rings
that are not fields, there are some special classes of cases (other than
finitely generated free modules) where one does have it. The second
part of extra-credit exercise I.7.1 is related to this question.
There is another curious fact. Let me write "*" for either kind of
duality. Then we know that there is a natural map E --> E**;
taking E* in place of E we get a natural map E* --> E***; on
the other hand, applying the contravariant functor ( )* to the map
E --> E** gives a map E*** --> E*. If I recall, the composite
E* --> E*** --> E* is the identity, so E* can be naturally identified
with a direct summand of E***, though this is not true in general
of E and E**.
----------------------------------------------------------------------
Regarding the concept of algebraically closed field (p.178), you ask
> Does it make sense to talk about algebraically closed Division Rings,
> and if so are the quaternions an example of an algebraically closed
> division ring?
Yes to the first question, no to the second. The existence of
division rings that are algebraically closed in a strong sense
is proved in
L. G. Makar-Limanov, Algebraically closed skew fields.
J. Algebra 93 (1985), no. 1, 117--135.
But the quaternions do not have that property. For example an equation
over the quaternions that has no root is Xi - iX - 1 = 0.
Note that the standard concept of algebraically closed field F says
that all equations over the field that could be satisfied in an
extension to a field (i.e., division ring satisfying the commutative
identity) have solutions in F. The division ring of quaternions satisfies
some identities weaker than commutativity; e.g., everything commutes
with the square of every commutator xy-yx. One could set up a concept
of a division ring that is algebraically closed relative to the class
of division rings satisfying a given family of identities; and
conceivably, the quaternions might be algebraically closed relative
to such a class.
----------------------------------------------------------------------
You ask about the fact that Lang only refers to the cases of
characteristic 0 and characteristic a prime (e.g., on p.179,
Proposition IV.1.12).
A ring of nonzero characteristic n contains a copy of Z/nZ, and
this has zero-divisors unless n is prime. So as long as we are
interested in integral domains, the only nonzero characteristics that
will come up are the primes.
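For instance, in characteristic 6 one already sees zero-divisors; a
one-line check in Python (my own illustration):

```python
# In Z/6Z we have 2 * 3 = 0 with both factors nonzero, so no integral
# domain can have characteristic 6.

n = 6
zero_divisors = [(a, b) for a in range(1, n) for b in range(1, n)
                 if a * b % n == 0]
assert (2, 3) in zero_divisors
```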
----------------------------------------------------------------------
You ask, regarding Theorem IV.2.3 on p.182, whether there is a ring A
which is not a UFD, but such that the prime elements of A[X] are
nevertheless the primes in A and the primitive polynomials irreducible
in K[X] (where K is the field of fractions of A).
I think so -- I think that it should not be hard to show that if
A is the union of a chain of UFD's, A_1 \subset A_2 \subset ... ,
then it will have the latter property. But such a union need not
be a UFD. For example, if one adjoins to a field k an indeterminate
Y, and then a square root of Y, and a square root of that
square root, etc. -- getting a ring that could be written
k[Y, Y^{1/2}, Y^{1/4}, Y^{1/8}, ... ], then Y cannot be written
as a product of irreducibles.
Don't have time to think through the details, though. The grading of
the second Math 113 midterm has put me way behind on preparation.
----------------------------------------------------------------------
Regarding Lang's comment on p.183, "It is usually not too easy to
decide when a given polynomial (say in one variable) is irreducible",
you note that the integral root test provides an algorithm for finding
linear factors, and you ask whether there are similar algorithms for
higher-degree factors.
I don't know much about the subject. To say there is an algorithm
is not to say it is easy -- there is an algorithm for factoring an
integer N, namely "Test every integer \leq \sqrt N to see whether
it is a factor", but we know that for large N that is a lot of work.
In teaching Math 114 a few years ago, I did think about the question
of finding not necessarily linear factors for a polynomial with
integer coefficients, and came to the conclusion that at least there
are only finitely many that need to be tested. This is sketched as
exercises 3.18 and 3.19 (pp.2-3) in my packet from that course,
http://math.berkeley.edu/~gbergman/ug.hndts/m114_IStwrt_GT3_exs.ps .
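For linear factors, the integral/rational root test the question
mentions looks like this in Python (a sketch of the standard test,
assuming a nonzero constant term; the names are mine):

```python
# Sketch of the rational root test: a rational root p/q (lowest terms)
# of c_n X^n + ... + c_0 with integer coefficients, c_0 != 0, must have
# p | c_0 and q | c_n -- so only finitely many candidates need testing.

from fractions import Fraction

def divisors(n):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """coeffs lists c_0, ..., c_n with c_0, c_n nonzero."""
    candidates = {Fraction(s * p, q)
                  for p in divisors(coeffs[0])
                  for q in divisors(coeffs[-1])
                  for s in (1, -1)}
    return sorted(r for r in candidates
                  if sum(c * r**i for i, c in enumerate(coeffs)) == 0)

# 2X^2 - 3X - 2 = (2X + 1)(X - 2)
assert rational_roots([-2, -3, 2]) == [Fraction(-1, 2), Fraction(2)]
```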
----------------------------------------------------------------------
You ask how Lang gets a contradiction at the end of the proof of the
italic statement on p.192.
Your second guess is right -- he is implicitly performing induction
on n. Thanks for pointing this out; I'll add the explanation to
the Companion.
----------------------------------------------------------------------
You ask (cf. Prop. V.1.4, p.225, Prop. V.1.6, p.227, and Companion,
bottom of p.84 and top of p.85):
> Is there a way to characterize when a field extension
> $k(a_1, \cdots, a_n)$ is the same as its subring
> $k[a_1, \cdots, a_n]$?
This will happen if and only if a_1, ..., a_n are all algebraic
over k; but the simplest proof I can see requires methods that
it would be messy to develop at this point. Roughly, the argument
is as follows: Suppose K = k[a_1, ...,a_n] is a field, but that
a_1, ..., a_n are not all algebraic over k. Let us rearrange these
generators, if necessary, so that a_1 is transcendental over k.
Then, if not all of a_2,...,a_n are algebraic over k[a_1], rearrange
them so that a_2 is transcendental, etc. Renaming the transcendental
elements x_1,...,x_p, we can thus write our field in the form
k[x_1,...,x_p, b_1,...,b_q] where p > 0, x_1,...,x_p are
algebraically independent over k, and b_1,...,b_q are algebraic
over k[x_1,...,x_p].
Note that within the field K = k[x_1,...,x_p, b_1,...,b_q], the
subfield L = k(x_1,...,x_p) generated by k[x_1,...,x_p] will be
isomorphic to the field of fractions of k[x_1,...,x_p], and that the
latter ring is (up to isomorphism) the ring of polynomials in
x_1,...,x_p.
Now since b_1,...,b_q are algebraic over the subfield L,
[K:L] is finite, hence in terms of some basis of K over L, the
operations of multiplying by each element of K can be written
as a matrix over L; i.e., K can be identified with a ring of
matrices over L. Now when b_1,...,b_q are written as such matrices,
only finitely many of the irreducibles in the polynomial ring
k[x_1,...,x_p] can occur in the denominators of the finitely many
entries of these finitely many matrices. From this, one can derive a
contradiction to the fact that every element of k[x_1,...,x_p]
becomes invertible in K.
----------------------------------------------------------------------
You ask about the difference between diagrams (2) and (3) on p.228.
I think that in (2), by making E close to k and EF close to F,
Lang is trying to focus your attention on the extensions E/k and
EF/F, and thus to suggest statement (2) on the preceding page,
"if E/k belongs to \cal C, then so does EF/F". On the other
hand, diagram (3) represents the situation of condition (3), which
is symmetric in the extensions E/k and F/k.
----------------------------------------------------------------------
You ask why, on p.230, 3rd line of paragraph preceding Lemma V.2.2,
Lang can say that EF is the field of fractions of E[F].
Good point. First, when he says it "is" the field of fractions, he
means that it is isomorphic, as an extension ring of E[F], to that
field of fractions. To show this, let us denote the field-of-fractions
construction by "ff". Then the universal property of ff(E[F])
says that given any commutative ring R with a homomorphism of E[F]
into it, under which all nonzero elements of E[F] become invertible,
there is a unique extension of this homomorphism to a homomorphism
ff(E[F]) --> R. In particular, since EF is by definition a field
containing E[F], we get such a homomorphism h: ff(E[F]) --> EF.
Since EF is generated as a field by its subring E[F], the image
of h can't be a proper subfield, so h is onto; and since ff(E[F])
is a field, and homomorphisms from fields to nontrivial rings are
one-to-one, h is one-to-one. So it is an isomorphism.
The general fact, of which this is a particular case, is that for
any integral domain A, ff(A) is, up to isomorphism, the unique
field containing (an isomorphic copy of) A, and generated by that
subring. I should probably put that into the Companion.
----------------------------------------------------------------------
> Supposing that we are interested in finding extensions of an arbitrary
> ring A that contains roots of certain polynomials. Are there
> conditions when we can do a similar construction as k[x]/(p(x))
> as in the bottom of page 230 in lang?
We can always adjoin generators and impose relations; i.e., form
a polynomial ring A[X_1,...,X_n] (if we want to adjoin an n-tuple
of elements satisfying some equations) and divide this by an ideal I.
If we want the resulting ring to have no zero-divisors, then we
need to make sure I is a prime ideal, which can be tricky (even
when n is 1) unless A has good properties, e.g., is a UFD.
If in fact A has no zero-divisors and we want our extension ring to
have the same property, then the simplest way to do this will often be
to form the field of fractions k of A, construct an extension
k[x]/(p(x)) as in Lang, which, as long as p(x) is irreducible
in k[x], is guaranteed to have no zero-divisors, and then take
the subring of this generated by A \subseteq k and the image of x.
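As a toy illustration of that last recipe (my sketch, not anything in
Lang; the pair representation and helper names are just for this
example), here is arithmetic in Q[x]/(x^2 + 1) with elements stored as
pairs (a, b) = a + b x; the subring generated by Z and the image of x
is then a copy of the Gaussian integers Z[i], with no zero-divisors:

```python
from fractions import Fraction

# Elements of Q[x]/(x^2 + 1) stored as pairs (a, b) meaning a + b*x.
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, with x^2 = -1
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)

one = (Fraction(1), Fraction(0))
x = (Fraction(0), Fraction(1))    # the image of x: a root of x^2 + 1

assert mul(x, x) == (-1, 0)       # x^2 = -1 in the quotient
# The subring generated by Z and x has integer coordinates: it is a
# copy of the Gaussian integers Z[i] inside the field Q[x]/(x^2 + 1).
print(mul(add(one, x), add(one, x)))   # (1 + x)^2 = 2x
```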
----------------------------------------------------------------------
You ask about Lang's proof of Lemma V.2.2, p.231.
The key fact is the preceding comment to the effect that EF is the
field of quotients of the set of elements of the form a_1 b_1 + ... + a_n b_n.
That means that an element is in EF if and only if it can be written
as a fraction with one such element in the numerator and another
such in the denominator, and (implicitly), the same statement for
sigma(E) sigma(F). Lang now observes that sigma takes expressions of
that form in elements of E and F to the corresponding expressions in
elements of sigma(E) and sigma(F). So on the one hand, it carries
EF into sigma(E) sigma(F); moreover, it goes _onto_ that subfield,
since every element of that subfield has the form shown on the
right-hand side of the display in the proof.
----------------------------------------------------------------------
Regarding p.237, proof of Theorem V.3.3, you write
> Here Lang seems to prove NOR1 => NOR3 => NOR2 in the first paragraph
> of the proof, and NOR2 => NOR1 in the second paragraph. So, is it true
> that his third paragraph on NOR3 => NOR1 is excessive, ...
Right! Good point.
----------------------------------------------------------------------
Regarding the proof of Theorem V.3.4, p.238, you note that
> Lang says several times let \sigma be an embedding, where it is
> understood that \sigma is an embedding in an algebraic closure. Only
> once does Lang say this though. Is this particular to this proof, or
> when dealing with normal extensions, or will this be a common
> occurrence?
Thanks for pointing this out. It's not any standard usage, that I
know of, but we should see how extensively Lang does this.
----------------------------------------------------------------------
You ask about the inequality "# of automorphisms \leq # of embeddings"
at the top of p.95 of the Companion (which I note in relation to
material on Lang's p.240).
Since E lies in k^a, every automorphism of E is, in particular, an
embedding in k^a. (Or if one wants to be formal, every automorphism,
when composed with the inclusion map E --> k^a, gives an embedding.)
----------------------------------------------------------------------
You ask how comparison of degrees in the 3rd display on p.248
gives [k(\alpha): k(\alpha^p)] = p.
There \alpha has minimal polynomial f, and \alpha^p has
minimal polynomial g. Now [k(\alpha): k(\alpha^p)] =
[k(\alpha): k] / [k(\alpha^p): k] = deg f / deg g. And this
ratio is p.
I've listed these three steps briefly, not knowing which of them
may have been your source of trouble. Let me know if the reason(s)
for any of them is/are not clear.
----------------------------------------------------------------------
You ask why, on p.251 in the next-to-last sentence of the proof
of Corollary V.6.10, E is separable over E^p k.
Because the pth power of every element \alpha of E lies in that
field. (Hence \alpha satisfies a polynomial, X^p - \alpha^p,
with coefficients in that field which has only one root in an
algebraic closure; hence the minimal polynomial of \alpha over
E^p k, which must divide every polynomial over that field which
\alpha satisfies, has only one root in the algebraic closure.)
----------------------------------------------------------------------
You ask why "perfect fields" (Lang, p.252) are so called.
It's based on thinking of inseparability as a flaw -- something that
makes it impossible to apply Galois Theory. Thus, not having any
inseparable extensions is a "good" feature. Among the various words
having the connotation "good", the one that was chosen for this
feature, probably more or less at random, was "perfect".
It may be that it wasn't entirely random. The original meaning of
"perfect" was "complete". (Cf. the grammatical term "perfect tense",
referring to verb forms like "has eaten", that say that an action has
been completed.) So it could mean that the property "All elements
of k^a that can't be distinguished from elements of k by the
behavior of automorphisms _are_already_in_ k" was thought of as a
"completeness" property of such fields k.
----------------------------------------------------------------------
Regarding the definition of a Galois extension (p.262) you ask how it
was discovered that separability and normality were the conditions
under which Galois Theory could be developed.
I don't know the detailed history of the subject, but I know that
its original form was very different from the way it is presented
today. Galois and the other early workers didn't think about groups of
automorphisms of field extensions, but groups of permutations of the
roots (in field of complex numbers) of a polynomial, which preserved
all algebraic relations satisfied by those roots. (So, for instance,
if one writes the four roots of X^4 - 2 as alpha, i alpha,
-alpha, and -i alpha, then the transposition that interchanges
the first two roots does not preserve the relation saying that
the first and third roots sum to zero and so does not belong to
the Galois group as they understood it, while the transposition that
interchanges the first and third does preserve all relations; it is
what we would view as the restriction to the set of roots of a certain
automorphism of the extension.) Since they were dealing with all
roots of a polynomial, their groups of permutations of the roots
corresponded to what we would see as groups of automorphisms of the
splitting field of the polynomial, which is automatically a normal
extension.
On the other hand, since they were working within the complex numbers,
which is of characteristic 0, separability was automatic. I assume
that the need for separability was discovered when people tried to
generalize the original results to arbitrary fields.
One may ask: If the early workers viewed Galois Theory in terms of
roots of equations rather than field extensions, how did they express
the correspondence between subgroups and subfields? Again, I don't
know the details, but I would guess that they looked at algebraic
expressions in the set of roots, and for each subgroup of the
permutations, considered those expressions invariant under that
group. The set of complex numbers that could be represented by such
expressions would (in modern language) describe a subfield of the
splitting field.
What would be the motivation for passing from "groups of permutations
of roots" to "groups of automorphisms of field extensions"? Probably
the realization that any two polynomials whose roots generated the
same field had "the same" Galois group; hence that it was really a
function of the extension, and not just of the particular polynomial.
----------------------------------------------------------------------
Concerning Lemma VI.2.c1 on p.102 of the Companion (discussion
relating to p.270 of Lang), you ask whether we know that the
expression for a symmetric polynomial in terms of the elementary
symmetric polynomials is unique.
That's the italicized statement on p.192, saying that the elementary
symmetric polynomials are algebraically independent. It means
that if we map the ring of polynomials in n indeterminates,
k[X_1,...,X_n] to k[t_1,...,t_n] by the homomorphism sending each
X_i to s_i, the map will be one-to-one. Hence each member of
the image of that homomorphism is the image of a unique element
of k[X_1,...,X_n], i.e., has a unique expression in terms of
s_1,...,s_n.
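As a concrete instance (my example, not Lang's): the symmetric
polynomial x^2 + y^2 + z^2 has the expression s_1^2 - 2 s_2. The
following spot check verifies that identity at random integer points;
of course, checking points doesn't prove the uniqueness -- that is
exactly what the algebraic independence of the s_i gives:

```python
import random

# s1 = x+y+z, s2 = xy+yz+zx; the symmetric polynomial x^2+y^2+z^2
# is expressed (uniquely, by the algebraic independence of the s_i)
# as s1^2 - 2*s2.  Spot-check the identity at random points:
for _ in range(100):
    x, y, z = (random.randint(-50, 50) for _ in range(3))
    s1, s2 = x + y + z, x * y + y * z + z * x
    assert x * x + y * y + z * z == s1 * s1 - 2 * s2
print("x^2 + y^2 + z^2 == s1^2 - 2*s2 at 100 random points")
```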
----------------------------------------------------------------------
You ask about the assumptions of characteristic not 2 or 3 in
Example 2, p.270.
You are right that under the assumption that f is irreducible of
degree 3, only "characteristic not 3" is needed to insure that f is
separable. However, the assumption that the characteristic is not 2
is needed to conclude in the next paragraph that any odd permutation
of the roots will move \delta. (It is always true that an odd
permutation sends \delta to -\delta; but in characteristic 2
\delta and -\delta are the same.) So though Lang mentions the
"not 2" assumption in an inappropriate place, that assumption is indeed
needed for the analysis to work.
Note that since S_3 has a normal series with factors Z_3 and Z_2,
analysis of the Galois theory of a separable cubic involves considering
successive Galois extensions with those two Galois groups. Extensions
with Galois group Z_3 behave differently in characteristic 3 from
other characteristics, while those with Galois group Z_2 behave
differently in characteristic 2 from other characteristics, so both
characteristics must be excluded to get the behavior that occurs in
"most" cases. We'll learn a lot about extensions with Galois group
Z_p, both in the "characteristic not p" and the "characteristic p"
cases, in section VI.6.
----------------------------------------------------------------------
You ask how the restrictions on the characteristic guarantee
separability in Example 2, p.270.
I'm surprised that I can't find a really explicit statement of the
fact in question in Lang. Anyway, it can be seen from Prop.V.6.1,
p.247, that an irreducible polynomial over a field k can be inseparable
only if k has characteristic p, and then only if the degree of the
polynomial is divisible by p. (This is pointed out explicitly in the
Companion, in the second paragraph of my comment on that page.)
----------------------------------------------------------------------
You ask how to get the subfields of the field discussed on p.271
corresponding to each subgroup shown on that page, noting that
some of these subgroups do not leave any roots of X^4 - 2 fixed.
I hope to show how to do this some time soon in class. But you
should be able to work it out yourself. Remember that the typical
element of Q(alpha, i) is of the form a + b alpha + c alpha^2 +
d alpha^3 + e i + f alpha i + g alpha^2 i + h alpha^3 i. Given
any one of those subgroups, it is easy to write down conditions
for the above expression to be fixed by the elements of that
subgroup. It sounds as though you were looking for members of the
basis {1, alpha, alpha^2, alpha^3, i, alpha i, alpha^2 i, alpha^3 i}
that were fixed; but the subspace of elements in a vector space
V fixed by a group of linear maps need not be spanned by those members
of a given basis of V that happen to be fixed by the group. Perhaps
what was misleading was that in this case, the basis is sufficiently
nicely chosen so that the subspaces corresponding to some of the
subgroups _are_ spanned by subsets of the basis. But not all.
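If you want to mechanize the computation, here is a sketch (my own
encoding, with alpha = 2^{1/4} and the 8-element basis above indexed
as alpha^k i^m): each automorphism alpha -> i^s alpha, i -> i^e acts
by a sign-matrix on that basis, and averaging the matrices of a
subgroup projects onto its fixed subspace, whose spanning vectors then
describe the corresponding fixed subfield:

```python
from fractions import Fraction

# alpha = 2**0.25; basis of Q(alpha, i) over Q, indexed j = 2*k + m:
basis = [(k, m) for k in range(4) for m in range(2)]   # alpha^k * i^m
idx = {b: j for j, b in enumerate(basis)}

def matrix(s, e):
    """8x8 sign-matrix of the automorphism alpha -> i^s alpha, i -> i^e
    (e = 1 or 3): it sends alpha^k i^m to i^(s*k + e*m) alpha^k."""
    M = [[0] * 8 for _ in range(8)]
    for (k, m) in basis:
        t = (s * k + e * m) % 4
        sign = 1 if t in (0, 1) else -1      # i^2 = -1, i^3 = -i
        M[idx[(k, t % 2)]][idx[(k, m)]] = sign
    return M

def fixed_vectors(group):
    """Average the subgroup's matrices; the nonzero columns of the
    resulting projection span the subspace of fixed vectors."""
    n = len(group)
    P = [[Fraction(sum(matrix(s, e)[i][j] for (s, e) in group), n)
          for j in range(8)] for i in range(8)]
    return {tuple(P[i][j] for i in range(8))
            for j in range(8) if any(P[i][j] for i in range(8))}

sigma = [(s, 1) for s in range(4)]     # <sigma>: alpha -> i alpha, i -> i
tau = [(0, 1), (0, 3)]                 # <tau>: complex conjugation
print(fixed_vectors(sigma))   # spanned by 1 and i: fixed field Q(i)
print(fixed_vectors(tau))     # spanned by 1, alpha, alpha^2, alpha^3: Q(alpha)
```

Note that for <sigma> the fixed space is spanned by 1 and i, neither
of which is a root of X^4 - 2, illustrating the point above.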
----------------------------------------------------------------------
You ask why, in Example 7 on p.274, the polynomial X^5 - X - 1 is
irreducible over F_5.
This follows from Theorem VI.7.4(ii), p.290. (As I say in the
Companion, this example uses later material). Specifically, we know
that X^5 - X is identically zero on F_5, so it never assumes the
value 1, so X^5 - X - 1 has no root in that field, hence by that
theorem, it is irreducible.
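One can also confirm the irreducibility by brute force, without the
theorem: since the degree is 5, any factorization over F_5 would have
a monic factor of degree 1 or 2, and both possibilities can be ruled
out by machine (a sketch; the little remainder helper poly_mod is my
own, not from any library):

```python
# Brute-force check that f = X^5 - X - 1 is irreducible over F_5.
# deg f = 5, so any factorization over F_5 has a (monic) factor of
# degree 1 or 2; we rule both out.
p = 5
f = [4, 4, 0, 0, 0, 1]    # X^5 - X - 1 mod 5, coefficients low degree first

def poly_mod(f, g, p):
    """Remainder of f on division by the monic polynomial g, mod p."""
    f = f[:]
    while len(f) >= len(g):
        c = f[-1]
        for i in range(len(g)):
            f[len(f) - len(g) + i] = (f[len(f) - len(g) + i] - c * g[i]) % p
        f.pop()
    return f

# no roots in F_5 (x^5 = x there, so f(x) = -1 for every x):
assert all((x**5 - x - 1) % p != 0 for x in range(p))
# no monic quadratic factor X^2 + bX + c:
assert all(any(poly_mod(f, [c, b, 1], p))
           for b in range(p) for c in range(p))
print("X^5 - X - 1 is irreducible over F_5")
```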
----------------------------------------------------------------------
You ask how Gauss's Lemma is applicable in the 3rd line of the
proof of Theorem VI.3.1, p.278.
Note that since ord_p(1) = 0 for all p, a polynomial f having a
coefficient equal to 1 must have ord_p f \leq 0 for all p. Hence
if the product of two such polynomials has coefficients in our given
UFD (here, Z), i.e., has ord_p \geq 0 for all p, we see from
Gauss's Lemma that these orders are all 0; in particular, the
polynomials have coefficients in our UFD. Thus, a factorization of a
monic polynomial over Z into monic polynomials over Q must in fact
be a factorization into polynomials over Z.
----------------------------------------------------------------------
Concerning the definition of "character" (p.282), you ask why they
are so named.
There is seldom a solid answer to why some everyday word was chosen
as the name of a mathematical concept; but I can say a bit about the
more general meaning of "group character" of which the usage Lang gives
is a special case.
If a finite group G acts by vector-space automorphisms on a
finite-dimensional vector-space V, this is called a "(linear)
representation of G". This can be thought of as a module over the
group algebra kG, and I'll use that viewpoint because we have learned
basic module-theoretic language (though group theorists most often
speak directly about representations without going via the language of
modules). Now by choosing a basis for V, we can represent the action
of each element of G by an invertible matrix over k; but this map
to the matrix ring isn't an invariant of the representation because it
depends on the choice of basis. However, the function that associates
to each member of G the _trace_ of the corresponding invertible
matrix is independent of choice of basis, and turns out to carry very
powerful information about the representation. It is this that
group-theorists call the "character" of the representation. (See Lang,
section XVIII.2.) I suppose the idea behind the choice of that word
was "fundamental feature".
The most important representations that group theorists study, the
"irreducible" representations, are the simple kG-modules (modules with
no proper nonzero kG-submodules). If G is abelian, it turns out that
all irreducible representations over an algebraically closed field are
1-dimensional. In this case, such a representation is just a
homomorphism of G into the multiplicative group of 1-by-1 matrices,
which is the multiplicative group of k, and the trace of such a
matrix is just its one element; so characters of such a group are just
homomorphisms into the multiplicative group of k. That is the
restricted meaning of the term that Lang introduces on p.282.
----------------------------------------------------------------------
You ask about the last line of p.284, saying that for an inseparable
extension, the trace function is 0.
To see this, remember that if [E:k]_i > 1, then [E:k]_i is a power
of the characteristic. Hence as an element of the field, it is 0.
----------------------------------------------------------------------
> ... p.287 of Lang middle of the page 3rd display from the bottom
> "The polynomials (f(x)/(x-\alpha_i))*(\alpha^r)/(f'(\alpha))
> are all conjugate to each other."
>
> What does it mean for polynomials to be conjugate to each other?
On p.243, second paragraph, Lang defined elements to be conjugate if
they were images of the same element under various embeddings. In
the case of a Galois extension, this is equivalent to saying that they
are in the same orbit under the action of the Galois group. Here he is
implicitly extending this action of the Galois group to polynomials,
by letting members of the group act on the coefficients; thus he means
that these polynomials lie in the same orbit under that action.
----------------------------------------------------------------------
Regarding the last display on p.287, you ask why the \alpha there
has no subscript. (Actually, this applies to the last two displays.)
The operator "Tr" sends its argument to the sum of its images under
all the embeddings of k(\alpha) in k^a. These embeddings take
\alpha to \alpha_1, ..., \alpha_n respectively. It is true that
(assuming k^a is taken to contain k(\alpha)) the element \alpha
will be one of those \alpha_i, and we could arbitrarily choose it
to be \alpha_1, but that isn't relevant here. Since we started with
\alpha in the statement of the theorem, we may as well write it here,
knowing that when Tr is applied to the expression, we will get a
sum of expressions involving the \alpha_i with which we have been
computing above.
----------------------------------------------------------------------
You note that Lang hasn't spelled out the steps of the proof of
Prop.VI.7.1, p.291.
Right. That means that he expects that you can fill in the details.
You first need to check the definition of "distinguished class", if
you aren't sure of it, and note the assertions that need to be proved,
and what they mean in this case. Then go through them and see whether
his proof contains what is needed to get them. If you are uncertain
about some point, then ask about it!
----------------------------------------------------------------------
Regarding ruler-and-compass constructions (Companion, note on p.293),
you mention reading in Dummit and Foote that one can construct with
a ruler on which a unit distance is marked things that one cannot
construct with an ordinary ruler and compass, and ask how this is
possible.
Such constructions are implicitly based on the assumption that one can
use such a tool in a certain way: Put one mark of the ruler on some
line, move the ruler so that that mark moves along the line, while the
edge of the ruler always passes through a specified point (e.g., by
putting a peg at that point and sliding the ruler against the peg), and
trace out the curve that the other mark on the ruler moves through.
Finally, it is assumed that one can find the point where this curve
intersects another line or curve.
This is a plausible interpretation of how one could use such a ruler,
but it bothers me that it is taken for granted, rather than stated
explicitly. Perhaps there is a touch of showmanship in descriptions
of what one can do with this construction -- surprising the reader by
bringing in an ingenious real-world image of how a ruler might be used.
In any case, one cannot do this with ordinary straightedge and compass.
One could find as many points as one likes on the curve referred to
above; one could approximate its point of intersection with, say, a
line, to any degree of accuracy by repeatedly constructing more points.
But in a finite number of steps one couldn't find the exact point of
intersection with that line.
----------------------------------------------------------------------
You ask why, in the proof of Theorem VI.8.2, p.295, the assumption that
B_2/(k^{*m}) is finitely generated implies that it is finite.
Because it has exponent m. (The m-th power of every element of B_2
is, in particular, the m-th power of an element of k*, hence it is
in the "denominator" of B_2/(k^{*m}); so every element of the
factor group has m-th power 1.) In an _abelian_ group generated by
finitely many elements x_1, ..., x_r of nonzero exponents
n_1,...,n_r, every element can be written x_1 ^{a_1} ... x_r ^{a_r}
with 0\leq a_i < n_i, so such a group is finite.
----------------------------------------------------------------------
You ask why, in the 6th line of the proof of Theorem VI.8.2, p.295, one
has k((B_2)^{1/m}) = k((B_3)^{1/m}).
Since B_2 is contained in B_3, we have k((B_2)^{1/m}) contained
in k((B_3)^{1/m}). On the other hand, it was assumed two lines
earlier that k(b^{1/m}) is contained in k((B_2)^{1/m}), and the
reduction to the case B_2 finitely generated was done so as to
preserve that property, so it is still true on this line. Since
B_3 is generated by B_2 and b, we see that k((B_3)^{1/m}) will
be contained in k((B_2)^{1/m}).
----------------------------------------------------------------------
You ask why, in the first line of the second paragraph of the proof
of Theorem VI.8.2, p.295, the injectivity (one-one-ness) of the map from
groups to field extensions follows from the first paragraph of the
proof.
Lang proves there that any inclusion of fields k(B^{1/m}) implies an
inclusion between the corresponding groups B. Since equality of
two sets is equivalent to inclusions both ways, equality of two
fields of the form k(B^{1/m}) implies equality of the groups. That
says that the map from the groups to the fields is one-to-one.
----------------------------------------------------------------------
You ask whether a polynomial X^n - a (Lang, p.297) can split into
linear factors some of which are repeated but others not.
No. Let p be the characteristic of k. We know that if p|n,
then over the algebraic closure of k, X^n - a factors as
(X^{n/p} - a^{1/p})^p, and factoring the factor X^{n/p} - a^{1/p}
into linear factors, we see that each of these linear factors is a
repeated factor of X^n - a. On the other hand, if p does not divide
n, then X^n - a and its derivative are relatively prime (it is easy
to write a, and hence 1, as a linear combination of them), so
by Proposition IV.1.1, there are no multiple roots.
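The case p | n can be checked coefficientwise in a small example:
over F_5 one has a^{1/5} = a (Fermat), so the claim is that (X - a)^5
and X^5 - a have identical coefficients mod 5. A quick check for
a = 2, using only the binomial theorem (my sketch):

```python
from math import comb

# Over F_5, a^(1/5) = a (Fermat), so X^5 - a should equal (X - a)^5.
# Compare coefficients of (X - a)^5 and X^5 - a mod 5 for a = 2:
p, a = 5, 2
lhs = [comb(p, k) * (-a) ** (p - k) % p for k in range(p + 1)]  # (X - a)^p
rhs = [(-a) % p] + [0] * (p - 1) + [1]                          # X^p - a
assert lhs == rhs
print("(X - 2)^5 = X^5 - 2 over F_5: every root is a 5-fold root")
```

The middle binomial coefficients C(5,1),...,C(5,4) all vanish mod 5,
which is exactly why the polynomial collapses to a p-th power.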
----------------------------------------------------------------------
Regarding Corollary VI.9.3, p.299, you ask
> Is it usually possible, given an algebraically closed field K of
> characteristic 0, to find a subfield k such that K/k is of degree 2?
> It seems that one could look at a finite group of automorphisms of K
> and try to find a subgroup of order 2, but the problem is whether this
> finite group exists, and if it does, whether it has a subgroup of
> order 2. ...
It follows from Corollary VI.9.3 that if the Galois group of K/k
has any finite nontrivial subgroups, then all such subgroups are
of order 2 !
Typically, there will be many such automorphisms. For instance, note
that the rational function field Q(X) can be embedded in the complex
numbers by sending X to a real transcendental number, to a pure
imaginary transcendental complex number, or to one that is neither
(and that in the last case, there are three possibilities as to whether
the real and/or imaginary part of the image of X is transcendental).
If we let K be the algebraic closure of Q(X), then our embedding
of Q(X) in C must extend to an embedding of K in C. In the
cases where the image of X is real or pure imaginary, the image of
Q(X) will be closed under complex conjugation, and one can show that
the same will be true of the image of K. (One can also do this in
some cases when the image of X is neither real nor pure imaginary
-- probably whenever its real and imaginary parts are algebraically
dependent, though I haven't thought this through.) Hence complex
conjugation induces an automorphism of K via each of these
embeddings; but these automorphisms are different, because one fixes
X while the other sends it to -X. An automorphism with the former
property is conjugate to an automorphism with the latter property via
an automorphism of K that sends X to iX. I don't know whether all
order-2 automorphisms of K are conjugate.
----------------------------------------------------------------------
You ask about the statement on the top line of p.301, that the subgroup
shown at the bottom of the preceding page is the commutator subgroup.
I can't make sense of the explanation Lang gives ("because the factor
group ..."). Here's what I've been able to work out.
Though Lang says "for arbitrary n" two lines above the display of
that subgroup, we should take that only to mean that he is no longer
restricting to prime n; but still assuming n odd. Note that
for n odd, 2\in (Z/nZ)*. Now let A be the matrix one gets from
the expression "M = ..." in the third display on p.300 by taking b = 0
and d = 2, and let B be the matrix one gets by taking d = 1 and
any b. Then you will find that A B A^{-1} B^{-1} = B, showing that
B is in the commutator subgroup, as Lang claims.
This does not remain true if n is even. In that case, every
d\in (Z/nZ)* is odd, and if you compute commutators, you will
find coefficients "d-1" on the elements in the lower left position
of your matrices, from which you can deduce that every member of
the commutator subgroup is of the form at the bottom of p.300 with
an _even_ entry b; so we don't get all the elements he claims.
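Here is a quick machine check of the odd-n computation (a sketch,
assuming, as is one common way of writing these groups, the matrices
M(b,d) = [[1,0],[b,d]] with b in Z/nZ and d in (Z/nZ)*, so that b
sits in the lower left position; the helper names are mine):

```python
# Matrices M(b, d) = [[1, 0], [b, d]], b in Z/nZ, d in (Z/nZ)*.
def mmul(X, Y, n):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) % n
             for j in range(2)] for i in range(2)]

def M(b, d):
    return [[1, 0], [b, d]]

def Minv(b, d, n):                       # inverse of M(b, d) mod n
    dinv = pow(d, -1, n)                 # requires gcd(d, n) = 1
    return M(-dinv * b % n, dinv)

for n in (5, 9, 15):                     # n odd, so d = 2 is invertible
    A = M(0, 2)
    for b in range(n):
        B = M(b, 1)
        comm = mmul(mmul(mmul(A, B, n), Minv(0, 2, n), n), Minv(b, 1, n), n)
        assert comm == B                 # A B A^-1 B^-1 = B
print("B lies in the commutator subgroup for every odd n tested")
```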
----------------------------------------------------------------------
You asked about results such as the proof that R(sin\theta, cos\theta)
is pure transcendental, noted in the Companion, p.126, in the note
to p.357 of Lang.
I don't know much about such results, but there is a heuristic that
leads to the result in that case. Note that the field is generated by
the coordinate functions on the circle. Take it for granted that what
one wants is equivalent to a way to parametrize points of the circle by
a single real parameter, using rational functions. If one tries
something "obvious" like intersecting the circle with vertical lines
differing in their x-coordinates, one runs into the trouble that every
line hits the circle twice; so the point of intersection is the
solution to a quadratic equation, which is not a rational function of
x. However, suppose instead that one fixes a point of the circle, say
(-1,0), and intersects the circle with lines passing through that point,
parametrized by their slope. There will still be two solutions for
every slope, but one of these is known, the point (-1,0), and when
one knows one solution of a quadratic equation, one can find the other
solution by ring operations (without using square roots). And this
indeed gives a parametrization of the circle by rational functions of a
single variable. (The point (-1,0) itself appears when that variable
goes to infinity.) Moreover, by some elementary geometry, the angle
that line makes with the x-axis is half the arc cut off between (1,0)
and its other intersection with the circle. If we call the general
point of the circle (cos\theta, sin\theta), this says that the slope
of the line is tan(\theta/2). This explains the fact that one can
express sin\theta and cos\theta as rational functions of
tan(\theta/2) and vice versa.
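Carrying out the computation sketched above gives the classical
formulas x = (1 - t^2)/(1 + t^2), y = 2t/(1 + t^2), where t is the
slope of the line through (-1,0). A quick numeric check (my sketch)
that with t = tan(theta/2) this really gives (cos theta, sin theta):

```python
import math

# Line of slope t through (-1, 0) meets the unit circle again at
#   x = (1 - t^2)/(1 + t^2),   y = 2t/(1 + t^2);
# with t = tan(theta/2) this is the point (cos theta, sin theta).
for theta in (0.3, 1.0, 2.5, -1.7):
    t = math.tan(theta / 2)
    x = (1 - t * t) / (1 + t * t)
    y = 2 * t / (1 + t * t)
    assert math.isclose(x * x + y * y, 1.0)
    assert math.isclose(x, math.cos(theta))
    assert math.isclose(y, math.sin(theta))
print("tan(theta/2) gives a rational parametrization of the circle")
```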
Algebraic geometers use this principle in other ways. If one has
a cubic curve in the plane, a line will in general intersect it in
three points. But if the cubic has a double point, then a line through
that point can be considered to intersect it twice there, and will have
just one other point of intersection with it. This leads to the fact
that a cubic with a double point has a rational parametrization, while
other cubics do not.
----------------------------------------------------------------------
You ask how one would deduce, as I indicate in the Companion, (p.126,
note to follow end of VIII on p.357 of Lang) that real or complex
functions e^{a_1x}, ... e^{a_nx} are algebraically independent if and
only if a_1, ..., a_n are linearly independent over Q, from the
assertion that e^{c_1x}, ... e^{c_mx} are linearly independent if
and only if c_1,...,c_m are distinct.
If R is a commutative ring containing a field K, the condition that
a family of elements x_1,...,x_n of R is algebraically independent
over K is equivalent to saying that the family of all monomials
in these elements, (x_1^{d_1}...x_n^{d_n})_{d_1,...,d_n\geq 0},
is K-linearly independent. Now the monomials in the functions
e^{a_1x}, ... e^{a_nx} are the functions
e^{(d_1 a_1 + ... + d_n a_n)x} (d_1,...,d_n\geq 0), and from the
assertion quoted above, these are linearly independent if and only
if the expressions d_1 a_1 + ... + d_n a_n (d_1,...,d_n\geq 0) are
all distinct, which holds if and only if a_1, ..., a_n are linearly
independent over Q. (That last "if and only if" takes a little
thought. Given a linear dependence relation over Q, one
can clear denominators, and move the negative terms to the other
side of the equation, to get an equality of linear combinations with
nonnegative integer coefficients.)
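A toy numeric illustration of the "only if" direction (my example):
with a_1 = 1, a_2 = 2, which are linearly dependent over Q, two
distinct monomial exponent-vectors give the same sum d_1 a_1 + d_2 a_2,
and the resulting coincidence of monomials is the algebraic relation
(e^x)^2 = e^{2x}:

```python
import math

# a_1 = 1, a_2 = 2 are linearly dependent over Q; correspondingly the
# distinct monomial exponent vectors (2, 0) and (0, 1) give the SAME
# sum 2*1 + 0*2 = 0*1 + 1*2, so the monomials (e^x)^2 and e^{2x}
# coincide -- an algebraic relation between e^x and e^{2x}.
assert 2 * 1 + 0 * 2 == 0 * 1 + 1 * 2
for x in (0.0, 0.5, -1.3):
    assert math.isclose(math.exp(x) ** 2, math.exp(2 * x))
print("(e^x)^2 - e^(2x) = 0")
```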
----------------------------------------------------------------------
Regarding my comment in the Companion on the proof of Theorem
A2.1.3 (p.877), where I say that by making infinitely many choices of
elements, Lang implicitly uses the Axiom of choice, and that even
making the needed choices as a countable sequence and saying "and so
forth" would implicitly require that axiom, you ask why induction
doesn't suffice.
The best I can do is offer an analogy: There are uncountably many
sequences of 0's and 1's, so it is not true that for every such
sequence, there is a finite computer program that (running on a machine
that never wore out) would give that sequence. Yet for every such
sequence and every n, there is certainly a computer program that will
give the first n terms of that sequence; e.g., you can include those
terms in the program, and just have it spew them out. Here, similarly,
having every finite case "doable" doesn't mean -- without the
assumption of the Axiom of Choice -- that the whole task is doable.
As for induction being applicable -- all that induction tells us is that
for every n there exists an n-tuple (x_1,...,x_n) with x_i\in X_i;
it doesn't say that there exists a natural-numbers-tuple (x_1,...)
with that property.
Not being an expert in the field, I can't give a precise explanation,
but my understanding is that set theorists, given a model of set theory
in which the Axiom of Choice holds, can construct within it a model
in which that Axiom fails -- that is, they define certain sets and only
those to be the "sets" of that theory, and verify that this system
satisfies the other Zermelo-Fraenkel axioms, but not Choice.
----------------------------------------------------------------------
You ask, with respect to the proof of the Axiom of Choice from Zorn's
Lemma, "what chain is {p}?"
As stated on the first page of the material on the Axiom of Choice
etc., "A subset S of a partially ordered set P is called a
chain if it is totally ordered under the induced ordering." So
any one-element subset of a partially ordered set is a chain, by
definition, since it has no incomparable elements.
----------------------------------------------------------------------
You ask about the statement you had heard that "everything is equivalent
to the axiom of choice".
My first reaction was "Nonsense!", but thinking a little further, I
have a guess as to what was meant: It is that every appropriately
general statement that requires the axiom of choice for its proof is
in fact equivalent to that axiom. As an example of what I mean by
"appropriately general" -- the statement "there exists a well-ordering
of the real line" requires the axiom of choice to prove it, but it is
not equivalent to the axiom, but only to the case of that axiom where
certain of the sets involved have cardinality _< that of the real line.
But the statement "all sets can be well-ordered" is, as shown in the
reading in the Companion, equivalent to the axiom of choice. I believe
that, similarly, the major results of algebra that one proves using
the axiom of choice -- that every vector space has a basis, that every
ring has a maximal ideal, etc. -- all turn out to imply the axiom as
well. Loosely, the statement you heard means "If one can reasonably
ask whether something is equivalent to the axiom of choice, the answer
will be yes."
Of course, even so interpreted, the statement is doubtless an
exaggeration; I am sure set-theorists can come up with statements that
are strictly weaker than the axiom of choice without being restrictions
of more natural statements that are equivalent to that axiom. But it
is nonetheless an impressionistic summary of the repeated outcome when
one sees a result proved from the axiom of choice, and raises the
question of whether the implication is reversible.
----------------------------------------------------------------------
You're right that where Lang writes "card(M) <= card(A)" and
"by Bernstein's Theorem" near the top of p.889, neither is needed!
I shall include this in my next set of errata to him.
----------------------------------------------------------------------
You ask whether there is a way to explain why the one-word proof of
Corollary A2.3.7 (p.889), "Induction", doesn't work in the infinite
case.
I'm not sure in how deep a sense you are asking this. That proof
doesn't work in the infinite case simply because mathematical induction
is a technique that gives you results for all natural numbers (finite
ordinals) but not for infinite ordinals. If one thinks it should
extend to the first infinite ordinal, one gets absurd results; e.g.,
though induction using the statement "if n is finite, then so is
n+1" gives the correct fact that every natural number is finite (a
tautology, but that's beside the point), if we claimed it extended to
infinite ordinals, it would give the false conclusion "the first
infinite ordinal, omega, is finite".
There are techniques having somewhat the role of induction for infinite
arguments, of which we have learned one, Zorn's Lemma. But one doesn't
get the infinite case for free; one has to prove something more than is
needed to prove things in the finite case (otherwise, as I have said,
things true in the finite cases would all be true in the infinite case,
which simply isn't so). In Zorn's Lemma, this is the condition "the
partially ordered set is inductive". In any proof of this sort that
would give the result of Corollary A2.3.7 for denumerable products, this
step would be a description of how the products over larger sets can be
constructed from products over smaller sets, and a proof that this
method of construction preserved cardinalities.
There is a construction that gives large (e.g., denumerable) products in
terms of smaller (e.g., finite) products; it is called "the inverse
limit". (This is developed in Lang section I.10; we don't have time to
cover it in 250A and I don't have any material on it in the Companion.
It is covered in sections 7.1-7.2 of my Math 245 notes; these require
the basic concepts of category theory, which we will get in sections
I.11-I.12 of Lang.) But inverse limit does not preserve denumerability!
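A standard example (mine, not Lang's) of an inverse limit failing to
preserve denumerability: the 2-adic integers are the inverse limit of
the finite rings Z/2^n Z, yet form an uncountable set.  The finite
stages can be enumerated in Python:

```python
def threads(n):
    """Length-n truncations of points of the inverse limit
    ... -> Z/8Z -> Z/4Z -> Z/2Z: tuples (x_1, ..., x_n) with
    x_k in Z/2^k Z and x_{k+1} congruent to x_k (mod 2^k).
    Each truncation is determined by its last coordinate x_n."""
    return [tuple(x % 2 ** k for k in range(1, n + 1))
            for x in range(2 ** n)]
```

Each finite stage is finite (threads(n) has 2^n elements), but a
point of the full inverse limit is an infinite compatible sequence,
essentially an arbitrary infinite binary expansion; so the limit has
cardinality 2^{aleph_0}.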
Exercise A2:L13 shows that many cardinalities are indeed preserved by
denumerable products, and Lang originally asked whether this was true
of all sufficiently large cardinalities. But as his added note shows,
this is not so; details are sketched in the optional material in the
Companion.
----------------------------------------------------------------------
In relation to Corollary A2.3.7, p.889, you ask
> How would one go about finding the cardinality of the direct product
> of an infinite set A with itself denumerably many times?
There isn't a real sense in which one can "find" this, because the
properties of the arithmetic of infinite cardinalities depend on
the axioms of set theory (in addition to the standard ones) that one
assumes. But for some partial information, see Lang's Exercise A2:L14.
----------------------------------------------------------------------
For A an infinite set, you ask how large a set B must be so that
the cardinality of the set of functions from A to B would be
larger than that of the set of functions from A to {0,1} (cf.
Theorem A2.3.10, p.890).
B would have to be bigger than the set of subsets of A. To see
this, let us use the notation of the arithmetic of cardinals as in
Exercise A2:L8. Calling the cardinalities of A and B alpha and
beta, we see that if, on the contrary, beta _< 2^alpha, then
beta^alpha _< (2^alpha)^alpha = 2^(alpha x alpha) = 2^alpha. On the
other hand, for beta > 2^alpha, we get beta^alpha >_ beta > 2^alpha.
----------------------------------------------------------------------
Regarding Corollary A2.3.11, p.891, you ask
> Given an infinite set A and a set B of cardinal less or equal to A,
> what is the relation between the cardinal of A and the cardinal of
> the subsets of A of cardinal equal to B?
It is easy to see that it lies between the cardinality of A and
the cardinality of the set of subsets of A, and that it is at
least the cardinality of the set of subsets of B. Beyond that,
I don't know what one can say.
----------------------------------------------------------------------
You write that in the proof of the Schroeder-Bernstein Theorem (p.885),
the division of A as A_1 union A_2 seemed "an artifice for the
proof" rather than the intuitive argument you had been expecting.
Well, there is a neat intuitive idea, and then there is an artifice
needed to make it work out. The idea is to consider chains of
elements x, f(x), g(f(x)), f(g(f(x))), ..., alternately in A and B,
(continued backwards as well as forwards if the element on the left is
in the image of g or f respectively). This can be pictured by
drawing a red arrow from every element a\in A to f(a)\in B, and a
blue arrow from every b\in B to g(b)\in A, and looking at each
element as lying in a unique "ladder" of such arrows. Then the idea is
to get our correspondence by "switching adjacent pairs" in each
ladder. But which way to do the switch -- should we switch the pairs
connected by red arrows or those connected by blue arrows? One method
will leave the elements that are at the top of a ladder that begins
with a blue arrow unmatched; the other will do this to the elements
at the top of a ladder that begins with a red arrow. (Either will work
for a ladder that has no top.) So we use one method for one type of
ladder-with-a-top, the opposite way for the other type, and make an
arbitrary choice for ladders with no top.
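For what it's worth, here is a minimal Python sketch of the
ladder-switching rule (the names are mine), under the simplifying
assumption that every backward walk terminates -- i.e., that there
are no topless ladders, which holds in the example below but would
need separate handling in general:

```python
def match(a, f, g, f_inv, g_inv):
    """Schroeder-Bernstein pairing for a in A: walk a's ladder
    backwards to its top.  If the top lies in A, switch the pair
    joined by a red arrow (send a to f(a)); if it lies in B, switch
    the pair joined by a blue arrow (send a to g^{-1}(a)).
    f_inv and g_inv return None off the images of f and g."""
    x, in_A = a, True
    while True:
        prev = g_inv(x) if in_A else f_inv(x)
        if prev is None:            # x is the top of the ladder
            break
        x, in_A = prev, not in_A
    return f(a) if in_A else g_inv(a)

# Example: A = B = the natural numbers, with f(n) = g(n) = n + 1
# (injective but not surjective).  The resulting bijection swaps
# 0 <-> 1, 2 <-> 3, 4 <-> 5, ...
f = g = lambda n: n + 1
f_inv = g_inv = (lambda n: n - 1 if n >= 1 else None)
```

In this example every ladder has a top (0 in A or 0 in B), so the
backward walk always terminates.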
So -- elegant proofs involve both intuitive ideas, and technical tricks
needed to make them work!
----------------------------------------------------------------------
You ask
> ... are most set theories that do not allow [the set of all sets]
> free from inconsistency, or do they run into problems elsewhere? ...
The only axiom-systems that set-theorists are interested in looking at
are those that, so far as they know, are consistent; so "all" viable
set-theories have this property. My understanding is that set theorists
have proved various systems "consistent if arithmetic is consistent",
while knowing that the latter cannot itself be proved, only assumed.  Finally,
there are a lot of axioms that they look at (especially large-cardinal
conditions) that they strongly suspect are consistent with the existing
axioms, but for which they don't have any proofs of consistency. One
member of the Berkeley Math Department, Jack Silver, is a maverick in
this respect; he believes that these large-cardinal conditions probably
lead to contradictions; but so far as I have heard, he has never been
able to prove such a result.
My sense is that since Russell's paradox, set theorists have been aware
of where the dangers lie, and have been able to avoid them. But Silver
thinks otherwise.
----------------------------------------------------------------------