ANSWERS TO QUESTIONS ASKED BY STUDENTS in Math 250A Fall 2002 and Fall 2006 taught from Lang's "Algebra", 3rd ed. (The "Companion" referred to in places below is some supplementary material I provide for my students. Pagination in the Companion changes slightly from year to year, but in the answers below, I try always to include the page of Lang to which a note in the Companion refers, which should remain constant. The answers below are arranged according to the page in Lang referred to, even though in class we modify that order in places, e.g., interpolating sections from the chapter on modules between sections of the chapter on rings.) ---------------------------------------------------------------------- Regarding Lemma 0.c1 in the "Companion" (referring to p.x of Lang), you ask whether there is a general concept of algebraic structure, such that if A has such a structure, one can give criteria for such a structure to be induced on A/~, and get a similar result with "homomorphism" in place of "set map". Yes. Sets with algebraic structure in that general sense are the topic of the field variously called "universal algebra" or "general algebra". It is Math 245 here at Berkeley; cf. my course notes, http://math.berkeley.edu/~gbergman/245 . Algebraic structures in general are not formally defined until Chapter 8 of those notes, but lots of examples are given in Chapter 3 to motivate the later development. In particular, in section 3.2, it is observed (as we will see in reading #3) that the equivalence relations on a group G induced by group homomorphism are in one-to-one correspondence with the normal subgroups of G; while in section 3.10 of those notes it is shown that the corresponding result is _not_ true for monoids: The equivalence relation determined by a homomorphism is not determined by the set of elements the homomorphism sends to the identity. 
However, a precise description is given of which equivalence relations ~ on a monoid M are induced by homomorphisms (equivalently, which equivalence relations ~ have the property that M/~ can be made a monoid so that the canonical map is a homomorphism). These are called "congruences on M", and the analogous definition and concepts for general algebras (which are straightforward once one has seen the monoid case) are set down in Chapter 8. ---------------------------------------------------------------------- You ask about additive and multiplicative notation (Lang, p.3). As you observe, there are certain objects for which one or another operation is traditionally written with the symbol "+" or ".", and this tradition is so strong that people would not think of writing it differently; in particular, the real numbers and various related objects have two operations, one of which is traditionally written "+" and the other ".", so that changing notation for these would be very confusing. However, if one looks at a new group to which no such tradition applies, one could denote its operation by either symbol. Since traditionally-defined operations include some that are written multiplicatively but have noncommutative operation (e.g., matrices), while everything traditionally denoted "+" is commutative, one makes the convention of not using "+" unless the operation is commutative; so a commutative operation can be denoted by either symbol, while a noncommutative group or monoid operation can only be denoted "." (or, occasionally, by an ad hoc symbol such as "*".) More important than the fact that different groups can have operations written with different symbols is the fact that in studying groups, one will often be making statements concerning an arbitrary group; and since that group need not be commutative, one will generally write the operation ".". 
Since "arbitrary group" includes the case of groups such as R under the operation of addition, such results are applicable to these cases, so one must be able to view a statement about "." as a generic statement which includes cases written "+" (and, inversely, when one proves a result about abelian groups and writes the operation "+", one must be able to view this as including abelian groups whose multiplication is written ".".) I can't tell whether you need still more persuasion to accept "+" and "." as two possible symbols for a group operation, rather than only as names for distinct operations with distinct meanings. Let me know. (When we come to ring theory in reading #12, we will be back in the situation where "+" and "." have distinct, non-interchangeable meanings.) ---------------------------------------------------------------------- You ask about conditions for a monoid (p.3) to be embeddable in a group (p.7). Mal'cev obtained a necessary and sufficient criterion. It involves an infinite set of conditions, one obtainable from each "formation" of two kinds of brackets, "()" and "[]". (E.g., if I recall, ([)] is such a "formation".) Cf. P.M.Cohn, "Universal algebra", section VII.3. ---------------------------------------------------------------------- You ask whether the result stated in italics at the top of p.5 could be proved, alternatively, by showing that every permutation \psi is a product of transpositions of adjacent elements. Yes. Doubtless the reason that Lang didn't do it that way is that one thinks of this fact as a result in group theory (it appears as Exercise 38(b) at the bottom of p.78), so he didn't want to state it until he had defined groups, and the symmetric group in particular. 
However, another way to do things would certainly have been to prove this statement about transpositions here, as a lemma on the way to getting this result about commutative monoids, and then note that one already has it available "with no extra work" when one wants it in the study of the symmetric group later on. ---------------------------------------------------------------------- You ask about the term "family" on p.9, first Example, 4th-from-last line. "Family" is a general term mathematicians use to mean "elements collected together in some way". One way they can be collected is as a set; another way is as an I-tuple; so "family" can mean either of these. In this case, it means "I-tuple". ---------------------------------------------------------------------- You ask whether the square is the only shape whose symmetry group is "the group of symmetries of the square" (Lang, p.9, bottom). In a narrow sense, certainly not. If we start with a square centered at 0 in R^2, its symmetries will be a certain group G of linear maps R^2 --> R^2. Now if we take any geometric object X in the plane (e.g., a triangle), and "symmetrize" X using G, i.e., take the union Y of the images of X under all the elements of G, then in general, G will be the group of symmetries of Y. (In some special cases, Y will have more symmetries; but in "most" cases it will not.) However, one can interpret the question in other ways. One of them is "Is the above representation of the abstract group D_4 as the concrete group of linear maps of the plane that represent symmetries of the square essentially unique?" Again, the answer is no: There are faithful representations of D_4 by linear maps on R^n which are not built up in obvious ways from that representation. On the other hand, one can ask questions in which "unique" is replaced by "simplest", and then one tends to get positive answers. 
For instance, the smallest number n of elements such that D_4 can be represented faithfully by permutations of n elements is 4, and one can show that every faithful 4-element G-set is isomorphic either to the G-set of vertices of the square, or to the G-set of edges of the square. ---------------------------------------------------------------------- You ask what I mean by "left translation" and "right translation" in the discussion on "alias and alibi" in the Companion (note referring to Lang's p.13). What Lang calls the "translation map" T_x determined by x at the bottom of p.26 is more precisely called the "left translation map" determined by x, since it takes each y and multiplies it on the left by x. Likewise, the "right translation map" takes each y and multiplies it on the right by x, giving yx. Lang implicitly invokes these in referring to right cosets, at the top of p.27, though he doesn't give them a name. (Hmm. I should probably make the discussion of "alias and alibi" a comment to section I.5, rather than I.2.) > ... Does the problem of alias and alibi often arise in group theory? Probably less often than it used to, since group theorists are now clear that the fundamental concept of a permutation of a set should be defined as a bijective map of the set to itself. But in elementary expositions, it is still tempting to represent such actions as moving an object against a background, e.g., shifting marbles among holes; so it can still be a problem for beginners. ---------------------------------------------------------------------- You ask why, in the next-to-last line on p.13, one has xH = f^{-1}(f(x)); specifically, why the right-hand side is contained in the left-hand side. If y is an element of f^{-1}(f(x)), that means f(y) = f(x). Now, following the heuristic I noted in class, one takes this equation saying that x and y "behave alike" (under f) and transforms it into one that says that a certain element behaves like the identity. 
One can do this by multiplying either on the right or on the left by f(x)^{-1}. Performing those two multiplications, and using the fact that f respects products and inverses, we get, on the one hand, f(yx^{-1}) = e, and on the other hand, f(x^{-1}y) = e. These say yx^{-1}\in H, respectively x^{-1}y\in H, in other words, y\in Hx, respectively y\in xH, showing that f^{-1}(f(x)) is contained in both Hx and xH, as the equation you ask about, and the other one that Lang relates it to, require. ---------------------------------------------------------------------- You ask how we know that the homomorphisms of conditions (i) and (ii) on p.16 are unique. "Unique" in each case means "the only homomorphism that satisfies the stated conditions". In each case, write out the condition that Lang says the homomorphism is to satisfy, as an equation in group elements. I think you will see that that equation determines how the homomorphism must act. If not, write to me (or show me in office hours) how far you have gotten with the calculation, and I will show you what comes next. ---------------------------------------------------------------------- You ask about the discussion in the Companion concerning p.17 of Lang; in particular, why the composite map G -> G/K -> (G/K)/(H/K) has kernel H, noting that it is difficult to picture the "cosets of cosets" that comprise the latter group. Well, I don't consider it particularly helpful to regard elements of factor groups as cosets anyway. The important thing about a factor group G/H is that it is a group given with a homomorphism of G onto it which has kernel H. The construction by cosets is simply the tool used in showing that such a group exists. Let us write q_1 for the canonical map G -> G/K and q_2 for the canonical map G/K -> (G/K)/(H/K). Then the kernel of the composite q_2 q_1 is {x\in G | q_2 q_1(x) = e}. Since q_2 has kernel H/K, q_2 q_1(x) = e if and only if q_1(x)\in H/K; but q_1(x)\in H/K <=> x\in H. 
You also point out that in the next paragraph of the Companion, I refer to G/H, while Lang does not assume H normal. You're right; I hadn't realized that. Well, it is best to assume H normal when motivating the argument. Then, to get the boxed equation under Lang's more general assumption, apply the result so proved using N_H in place of G. Since by definition, H is normal in N_H, this works. You also ask why the kernel of the composite shown in the next-to-last display on p.17 of Lang is H. Well, write down a statement of what that kernel is, and compare with the definition of H a few lines earlier! (Remember that, as I said above, it is not important to look at G'/H' as consisting of cosets; just regard it as a group with a homomorphism of G' onto it that has kernel H'.) ---------------------------------------------------------------------- You ask why, in the proof of Proposition I.3.1 on p.18, X is normal in G. Lang says a few lines earlier, "... it suffices to prove that if G is finite, abelian ...". So he is assuming G abelian; and in an abelian group every subgroup is normal. ---------------------------------------------------------------------- You ask about the embedding of H_i/H_(i+1) in G_i/G_(i+1) near the bottom of p.19. This is a case of the second boxed isomorphism on p.17. Figure out what the "H" and "K" of that isomorphism have to be in order to make the left-hand side come out to H_i/H_(i+1), then see what the right-hand side comes to for those values of "H" and "K", and you'll see that it's a subgroup of G_i/G_(i+1); so the former embeds in the latter. Also check out the discussion I have in the Companion of the intuitive meaning of that boxed isomorphism, and see how it applies in this case. ---------------------------------------------------------------------- You ask what Lang means, on p.20, line 5, by "factors through the .. group G/G^c." 
He means that the given homomorphism f: G --> G' can be written as a composite, G --> G/G^c --> G', where the map G --> G/G^c is the canonical map from G to G/G^c. (I hope my discussion in class of the construction G/G^c made it clear why this is true.) If we call that canonical map q, and the second map above a, then we have f = aq, a "factorization" of f. (The fact that a group G/N, and in particular, G/G^c, is called a "factor group" of G is a coincidence. The "factoring" of the map refers to writing it as a composite, not to one of the groups involved being a factor-group.) ---------------------------------------------------------------------- In connection with the concept of simple group (p.20) you ask whether there exist infinite simple groups. Definitely, though it takes some work to verify the examples. To describe the easiest one, consider the group GL(n,K) of invertible n x n matrices over a field K (n > 1). This itself is not simple, since the determinant function is a homomorphism to the multiplicative group of K, and its kernel is a proper nontrivial normal subgroup. So let us consider that kernel, a group called SL(n,K). Even this may not be simple, because if K has a nontrivial nth root of unity zeta, then the corresponding scalar matrix zeta I has determinant 1, hence lies in this group, but is clearly central, hence generates a normal subgroup. So one divides out by the subgroup of all such elements, getting a factor-group called PSL(n,K). This is a simple group in almost all cases (the only exceptions are when n = 2 and K = Z/2Z or Z/3Z), but it takes some tedious linear algebra to prove this. (So it isn't really appropriate to give this example before we've covered linear algebra.) There are also examples obtained by presentations by generators and relations; but these can't be discussed until we have come to that technique. 
Oh, I guess there is an example that is easy to give at this point: I assume you have seen the fact that the alternating groups A_n are simple for n >_ 5, which we will prove soon. Well, one can define the "infinite alternating group" as the set of permutations sigma of the set N of natural numbers such that (i) sigma(i) = i for almost all i, and (ii) if we let k be an integer such that all the i with sigma(i) not-= i are _< k, then the restriction of sigma to {1,...,k} is an even permutation. Then one can verify that these elements form a group, and deduce from the fact that the groups A_n are simple for large n that this group is also simple. But this is a little less satisfying than the previous examples, because it arises from finite simple groups, while those examples are "essentially" infinite. ---------------------------------------------------------------------- You asked in your question for Monday about the meanings of "exponent" referring to a group: "an exponent" as defined by Lang on p.23 vs. "the exponent" meaning the least such value. Both are used; they are important in different contexts. If one wants to make a statement about all groups that satisfy x^6 = e, one wants to be able to call these "the groups of exponent 6" and not exclude those that in fact have exponent 2 or 3 (or the trivial group with exponent 1) -- the consequences of the equation x^6 = e remain true in those cases. But if one is studying a particular group, it is natural to look at the least exponent, and call it "the exponent of G". So the wordings "groups of exponent n" vs. "the exponent of G" carry this distinction. When there is real ambiguity, one should say which one means. ---------------------------------------------------------------------- You ask about the proof of Prop. I.4.3(iii) (p.24). The key idea lies in the paragraph preceding the proposition. 
In the context of (iii), each of a, b determines a homomorphism Z --> G; as in that paragraph, this induces an isomorphism between "G_1" and "G_2", in this case, G and G. If you look at the details of how this is constructed, you will see that it takes a to b. ---------------------------------------------------------------------- You asked about the isomorphism Z/mZ =~ G/f(mZ) in the second line of the proof of Prop. I.4.3, p.25. To see this, identify G with Z/nZ and f with the canonical map from Z to this factor group; then apply the first boxed isomorphism of p.17 with Z for "G", nZ for "K", and mZ for "H". ---------------------------------------------------------------------- Regarding Lang's Proposition I.4.3, proved on p.25, you ask > In the proof of (v), how does surjectivity follow from the Chinese > remainder theorem (which is a statement about ideals of a ring)? In the set of integers, the ideals are the same as the additive subgroups. (Anyway, the proof in the Companion gets around this use of a not-yet-proved result.) ---------------------------------------------------------------------- Regarding the note in the Companion about the top paragraph on p.26 of Lang, you ask: > Does it ever come in handy to know that every group is isomorphic > to a group of permutations? I have never been able to use that > to solve a problem. Well, if we didn't know this, it would certainly be of interest to study the properties of the class of groups that could be so represented, and figure out how they differ from other groups. Since the underlying significance of the group concept is that it describes the natural structure on the set of automorphism classes of a mathematical object, we would want to know "What properties of automorphisms have we missed in the definition of `group'?" 
It's also extremely useful heuristically -- if we want to find a group with a given property, we know that if we can find one, we can find such a group described as some group of permutations of a set; and that is often the best way to construct examples. ---------------------------------------------------------------------- You ask when the action of a group by conjugation on its subgroups is faithful, and whether it can be transitive (concepts introduced on p.28). Your e-mail parenthetically asks whether there is some connection between the kernel of that action and the center of the group, and you should have been able to see that there is: The center is contained in that kernel; so for the action to be faithful, the center must be trivial. I am not sure whether the converse is true. It seems very difficult for it to fail, since this would mean that every inner automorphism induced by a nonidentity element would have to move some elements, yet some such inner automorphism would have to carry all subgroups into themselves. There is a group in which certain inner automorphisms that move elements do carry all subgroups into themselves, namely the 8-element "quaternion group"; but these automorphisms modify all elements by members of the center of the group, and I don't see how to find an example that doesn't have a center. In summary: Having trivial center is a necessary condition for the action to be faithful, and might also be sufficient but I don't know. On whether it can be transitive -- it has to send every subgroup to a subgroup of the same order, so if one looks at the set of all subgroups, including G and {e}, it certainly can't be transitive unless G = {e}. ---------------------------------------------------------------------- You ask about the orbit decomposition formula (p.29). The basic idea is that if a set is the union of a family of disjoint subsets, then its cardinality is the sum of their cardinalities. 
To describe the cardinality of an orbit in S, one has to pick a point s of that orbit, and use Prop.I.5.1. On p.29, in the display preceding the orbit decomposition formula and the surrounding sentences, Lang sets up the notation for choosing a point from each orbit; the formula is expressed using that notation. I hope that with this in mind you can follow what he does. If not, write again. ---------------------------------------------------------------------- You ask whether the definition of pi(sigma) at the bottom of p.30 should be f(x_sigma^-1(1), . . .,x_sigma^-1(n)) rather than the formula Lang gives. That is a question that bothered me for a long time; but as implied in my comment on that formula in the Companion, it is correct. The difference between this and the case of d(sigma^-1( ), sigma^-1( )) (earlier on the same page of the Companion) is that in that case, the function sigma was being applied to each argument of d, while here sigma is being used to change the order of the arguments. Of course, if one thinks of f as having for "argument" an n-tuple of integers, then sigma is acting on the argument of f. But it is acting in a way that reverses order of composition; so two reversals combine to give a correct action. ---------------------------------------------------------------------- Your explanation for the argument at the top of p.31 seems to be right. In summary: Lang has proved that if a quotient of subgroups of the symmetric group is abelian, then the property of containing all 3-cycles carries over from the big subgroup (if it holds there) to the smaller one. Then he considers a tower in which all steps are abelian, and his top group, S_n, contains all the 3-cycles; so by the above statement, that property of containing all the 3-cycles carries over, inductively, to each member of the chain. So if one also assumes the chain ends in {e}, one has a contradiction. 
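One way to see the first step of that summary is to note that when n >_ 5, every 3-cycle is a commutator of two 3-cycles, while commutators of elements of the big subgroup must lie in the smaller one when the quotient is abelian. The following is only a rough brute-force sketch of that fact in Python; the helper names are hypothetical ones introduced for illustration, and permutations are represented as tuples acting on {0,...,n-1} rather than {1,...,n}.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def three_cycle(n, i, j, k):
    """The 3-cycle sending i -> j -> k -> i and fixing everything else."""
    p = list(range(n))
    p[i], p[j], p[k] = j, k, i
    return tuple(p)


def three_cycles(n):
    """The set of all 3-cycles on {0,...,n-1}."""
    return {three_cycle(n, i, j, k) for i, j, k in permutations(range(n), 3)}


def commutator(s, t):
    """Return s t s^{-1} t^{-1}."""
    return compose(compose(s, t), compose(inverse(s), inverse(t)))


def all_three_cycles_are_commutators(n):
    """Check whether every 3-cycle is a commutator of two 3-cycles."""
    cycles = three_cycles(n)
    commutators = {commutator(s, t) for s in cycles for t in cycles}
    return cycles <= commutators


print(all_three_cycles_are_commutators(5))  # True
print(all_three_cycles_are_commutators(4))  # False
```

The failure for n = 4 is consistent with the hypothesis n >_ 5: commutators of 3-cycles in S_4 all lie in the 4-element normal subgroup of A_4, which contains no 3-cycles.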
---------------------------------------------------------------------- You write, in relation to my discussion of semidirect products (Companion, comment on Lang's p.33): > Suppose N is a normal subgroup of G. I wonder if in general, we > can write G as some product of N and G/N (direct or semidirect or > something more general). Well, as the example G = Z, N = nZ that I gave in class shows, G may not contain a copy of G/N, so if "some product" implies some group that contains G/N, the answer is no. But if one doesn't require that, then yes. I'll sketch the construction as a generalization of the semidirect product. Suppose we are given groups H and N, and an action \psi: H -> Aut(N). Then we define a group with underlying set H x N, and with multiplication of the form (h_1, x_1) (h_2, x_2) = (h_1 h_2, c(h_1,h_2) x_1^{\psi(h_2)} x_2) This is just like the definition of the semidirect product, except for the term c(h_1,h_2), which is given by a function c: HxH --> N which is required to satisfy ... exactly those identities needed to make the above operation associative! Those identities are called "the cocycle conditions", and c is called a "cocycle" for H, N, and \psi. I think this construction is used mainly in the case where N is abelian; otherwise the cocycle conditions are not nice enough to make practical use of. (But I'm not a group theorist, so I only have impressions of what they do.) As a very simple example, consider again the case where G is the additive group of Z, and N is the subgroup of multiples of a positive integer n. Thus, we want to construct Z from H = Z/nZ and N = nZ. Here \psi is trivial; to define c, let us write each element of H as [i], the congruence class of i, where i\in\{0,...,n-1\}. Then we define c([i],[j]) to be 0 if i+j < n, and to be n\in nZ if i+j \geq n. Then you will find that pairs ([i],nk) compose just the way the integers nk+i add. 
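The last example can be checked mechanically. Below is a minimal sketch in Python, under the assumptions stated in the comments: the group is written additively, \psi is trivial, a pair ([i], nk) is stored as the tuple (i, nk) with 0 <= i < n, and the function names are hypothetical ones chosen for illustration.

```python
def cocycle(i, j, n):
    """c([i],[j]): 0 if i + j < n, and n (an element of nZ) otherwise."""
    return 0 if i + j < n else n


def multiply(p, q, n):
    """Compose pairs by the rule (h_1, x_1)(h_2, x_2) = (h_1 h_2, c(h_1,h_2) x_1 x_2),
    written additively, with trivial action of H = Z/nZ on N = nZ."""
    (i, a), (j, b) = p, q
    return ((i + j) % n, cocycle(i, j, n) + a + b)


def to_pair(m, n):
    """Write the integer m = nk + i as the pair (i, nk) with 0 <= i < n."""
    i = m % n
    return (i, m - i)


def to_integer(pair):
    """Recover nk + i from the pair (i, nk)."""
    i, a = pair
    return a + i


n = 5
# 7 = 5 + 2 and 4 = 0 + 4; their "product" should correspond to 11 = 10 + 1.
print(multiply(to_pair(7, n), to_pair(4, n), n))  # (1, 10)
print(to_pair(11, n))                             # (1, 10)
```

The carry that occurs when i + j >_ n is exactly what the cocycle records; without it one would get the direct product of Z/nZ and nZ rather than Z.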
---------------------------------------------------------------------- You ask about the isomorphism between right and left semidirect products (Companion, p.17, bottom, re Lang p.33). Well, remember that these are based on thinking of a group with a normal subgroup N and a subgroup H having trivial intersection with N, such that NH = G, and the two ways of uniquely writing any element of G, as nh or hn. A given element of G will have an expression in each of these forms; if you write down a formula relating one expression to another, and then use this as a formula for mapping a pair (n,h) to a pair (h,n), it will give the desired homomorphism. Remember also that the two semidirect products will involve homomorphisms "H --> Aut(N)" which have to be interpreted slightly differently, depending on whether one regards these as written on the right or on the left. To get the correspondence between the two constructions, one has to make precise the relation between these two versions of "Aut(N)". ---------------------------------------------------------------------- Sorry I didn't work Lemma I.6.1, p.33 into lecture; as usual, there is not enough time for everything I would like to say. Anyway, the proof of that Lemma consists of two parts: In the first sentence Lang states an auxiliary result which he will establish; the proof of that lasts until the display; when he has proved it, he uses it to prove the Lemma. That second part is clarified by my note in the Companion. It may seem at first that the auxiliary result is unrelated to the Lemma; but notice that each of them relates orders of elements to the order of the group. Of course, we know that the order of every element divides the order of the group; these results are partial converses -- they say, roughly, that the order of the group can't have prime factors that don't come from orders of elements. So it's not surprising that one of these statements yields the other. 
---------------------------------------------------------------------- You ask about conditions under which, given a factorization of the order of a finite group G into relatively prime factors r and s, one can say that G must have a subgroup of order r -- as a possible generalization of the existence of p-Sylow subgroups (p.34). To say that r is a factor of the order of G such that the complementary factor s is relatively prime to r is equivalent to saying that for some set of primes \pi, r is the product of the largest powers of the members of \pi occurring in the order of G. In this situation, a subgroup of G having order r is called a "Hall \pi-subgroup of G". I know that people have studied the question of for which sets of primes \pi a group will have a Hall \pi-subgroup, but I don't know what general results have been found. For a negative example, if G = S_7 and \pi = {5,7}, then a Hall \pi-subgroup of G would have order 35. If one existed, this would mean that a group of order 35 had a faithful action on a set of 7 elements. But as Lang notes in the next-to-last example on p.36, every group of order 35 is abelian. It is not hard to show that a faithful action of an abelian group of order 35 must have at least one orbit of order divisible by 5 and least one of order divisible by 7. Whether these are the same orbit (in which case, its order must be divisible by 35) or different (in which case, their orders must add up to at least 12), this is clearly impossible in a set of 7 elements. (The same argument shows that S_5 has no Hall {3,5}-subgroup; I used the above case just because Lang had explicitly noted the statement about groups of order 35.) Anyway, knowing the name, you should be able to look for further references if you want. ---------------------------------------------------------------------- You ask how the statement about (H:H_s_i) being divisible by p completes the proof of Lemma I.6.3(a) on p.34. 
That statement about (H:H_s_i) concerns the summands in the displayed equation that correspond to non-fixed points. Hence, modulo p, we can drop all those summands, and conclude that #(S) is congruent modulo p to the sum of the terms corresponding to the fixed points. Each of those terms is (H:H) = 1 (since the isotropy subgroup of a fixed point is the whole group), so #(S) is congruent to the number of those summands. ---------------------------------------------------------------------- You ask about sentence beginning "Indeed" on p.35, just before the first display. Here Lang is using the observation numbered (iv) on p.17, about two groups such that one is contained in the normalizer of the other; in particular, the statement beginning "equally obviously". ("Obviously" meaning "it comes right out when you write down what is needed".) ---------------------------------------------------------------------- You ask how the equality H = Q at the end of the proof of Theorem I.6.4 (p.35) gives statement (ii) of the theorem. Because the proof of statement (i) actually gives Q as one of the conjugates of P. (See the first words of that final paragraph, "Next, let S be the set of all conjugates of P ...".) ---------------------------------------------------------------------- You ask why the G_i are normal in the tower of Corollary I.6.6, p.35. The Corollary is proved by induction, so the question comes down to seeing why normality is preserved by the inductive construction. (It is trivial in the base case, where n=0.) In the proof, the inductive step takes a tower for G/H and gets from it a tower for G. The inverse image of any normal subgroup under a homomorphism (in this case, the homomorphism q: G -> G/H) is normal, so that step preserves normality of these subgroups. The one subgroup that appears at this step that is not an inverse image under that homomorphism is the final step {e} (as a subgroup of H = q^{-1}({e}); and of course, {e} is always normal in G. 
It is interesting to compare this result with the observation "every finite group has a normal tower with simple factors", and see what difference accounts for the fact that that does _not_ give a tower in which each step is a normal subgroup. ---------------------------------------------------------------------- You ask about a name for a normal tower in which every step is in fact normal in G, as in Corollary I.6.6, p.35. I don't know of such a name. You mention "subnormal tower"; but if anything, I would expect that term to be used to refer to what Lang calls a "normal tower" by someone who wants to restrict the phrase "normal tower" to the case where all the groups are normal in G, since a "subnormal subgroup" means a subgroup which can occurs in what Lang calls a normal tower. Incidentally, what Lang calls a normal tower is more often called a normal series. Google shows "subnormal" used much less commonly before "tower" or "series" than "normal"; so it doesn't seem to be a common usage. ---------------------------------------------------------------------- You ask about the induction in the proof of Corollary I.6.6, p.35. When Lang says "by induction", a more detailed statement would be "We may assume inductively that the result is true for all groups of smaller order; in particular, for G/H". Then he takes the inverse image, in G, of the tower that the inductive assumption gives for G/H. ---------------------------------------------------------------------- In the proof of Lemma I.6.7, p.36, you ask about the phrase "the representation of G on this orbit". Think of "representation" as meaning "action". The idea is that one thinks of each member of the group as being "represented" by a certain permutation of the given set. The usage is very common when one speaks about groups acting by linear automorphisms of a vector space -- the study of this concept, in various forms, is called the theory of group representations. 
But it is also sometimes applied to actions by arbitrary permutations on sets. ---------------------------------------------------------------------- You ask about the case K = H in the proof of Lemma I.6.7, p.36. K = H is what he is proving (by getting a contradiction in the contrary case). Do you see how it gives the conclusion of the lemma? ---------------------------------------------------------------------- You ask about Lang's use in the first Example on p.36 of the fact that the automorphism group of a cyclic group of order 7 is a cyclic group of order 6. It's true that he should have given some justification; but what is really needed, namely that the automorphism group has order 6, follows easily from what he proved earlier. See the Companion, p.19, 8th through 5th lines from bottom, sentence beginning "Now by ...". (Once one has the facts noted there about that group, an easy computation for q = 7 shows that it is cyclic, as he claims, though this is not needed.) ---------------------------------------------------------------------- You ask what I mean in the 5th from last line of the paragraph after Lemma I.6.c3 in the Companion (in the section on structures of groups of order pq, to go with p.36 of Lang) by mapping hn to "the automorphism it induces". I mean the automorphism given by conjugation by hn, restricted to N. ---------------------------------------------------------------------- You say that Lang's proof of his Lemma I.6.7 (p.36) doesn't seem to give the statement "N < K" asserted in my Lemma I.6.c2(a). You're right! My mistake was that I took for granted that the proof that Lang would give of his lemma was the "natural" one; but he gives a roundabout one, and that proof doesn't yield my assertion. The "natural" proof of Lang's lemma is to look at the action of G on G/K, rather than on the set of conjugates of K.
Then (whether or not K is normal) we see that the isotropy subgroup of K is K, so the kernel of the action is a normal subgroup contained in K. (To see how Lang's proof of Lemma I.6.7 can be completed from this start, note that _if_ (G:K) = p = smallest prime dividing (G:1), then (G:N), a divisor of p!, must equal p = (G:K), so K = N, so K is normal.) Thanks for pointing this out! ---------------------------------------------------------------------- Regarding direct sums of abelian groups (pp.36-37) you ask whether it isn't possible to define them for non-abelian groups as well. Well, there are two ways of thinking about a direct sum: As the group having a certain universal property, and as the subgroup of the direct product consisting of elements all but finitely many of whose coordinates are the identity. Groups of each of these sorts can be constructed in the non-abelian context, but, in contrast to the situation for abelian groups, they are two very different groups. The one with the universal property like that of the direct sum of abelian groups is called the coproduct of the groups; we will see that construction in this coming Monday's reading. The subgroup of the direct product consisting of elements all but finitely many of whose coordinates are the identity is called the "restricted direct product". ---------------------------------------------------------------------- You ask about the symbol "f|B" in the proof of Theorem I.7.1, p.41. It means the restriction of f to B; i.e., the function which has domain B instead of A, but on elements of that domain acts exactly as f does. (Lang notes this notation on p.ix.) 
---------------------------------------------------------------------- You ask (in connection with the "Remarks on split surjections, split injections, and split exact sequences" following the discussion in the Companion about the proof of Lemma I.7.2, p.41 of Lang), what I mean by the short exact sequences "corresponding to" a given surjective or injective map of abelian groups. If f: A --> B is a surjective homomorphism of abelian groups, then 0 --> Ker(f) --> A --> B --> 0 is the corresponding short exact sequence. Likewise, if g: C --> A is injective, the corresponding short exact sequence is 0 --> C --> A --> A/g(C) --> 0. ---------------------------------------------------------------------- You ask about the argument at the top of p.42 that Lang uses to show that any two bases of a free abelian group have the same cardinality; in particular, you ask why B/pB is a direct sum of m copies of a cyclic group of order p. The general observation one needs is that if an abelian group A is written as a direct sum B (+) C, and if n is an integer, then A/nA (where nA denotes {nx | x\in A}), can be identified with B/nB (+) C/nC. Indeed, we get nA = nB + nC, and the isomorphism (B (+) C) / (nB (+) nC) =~ B/nB (+) C/nC is not hard to verify. The same is true for direct sums of more than two summands. Now if A is a free abelian group of rank m, it is a direct sum of m copies of Z, so A/pA is a direct sum of m copies of Z/pZ, which is Lang's statement. ---------------------------------------------------------------------- You ask why, on p.43 line 5, we have urx \in A_s and vsx \in A_r; i.e., why surx = rvsx = 0. Because by assumption, x\in A_m, and m = rs; and both of the coefficients sur and rvs are divisible by rs. ---------------------------------------------------------------------- Regarding Lang's reference to the "residue class" \bar{x} of x in the middle of p.44, you ask whether this just means the coset it belongs to. Right.
The term "coset" is the more common one in group theory, while "residue class" is more common in ring theory. (Different traditions.) But Lang happens to use the latter term here in a group-theoretic context. ---------------------------------------------------------------------- You ask about the choice of the symbols bold Z versus ordinary (actually, italic) Z, as on p.46, bottom, in writing "Z_p" etc. Well, boldface Z (or in recent decades, blackboard-bold Z) has become standard for the integers, whether as a set, a group, or a ring. (Historically, it comes from the initial letter of German "Zahl", meaning "number".) So one can regard "Z_p" as short for "Z/pZ", and so use boldface Z. But there are two difficulties. First, number-theorists like to write Z_p for the ring of p-adic integers (cf. Lang, last 3 lines of p.50 and first two lines of p.51, not in this course's readings), which has a very different structure from Z/pZ. (It is torsion-free, and uncountable.) Second, the use of "Z" for cyclic groups actually comes from the German "zyklisch" meaning "cyclic"; and it is convenient to use it even in the case where the group is written additively rather than multiplicatively, and/or with a generator denoted by some symbol other than "1". So these considerations lead one to use non-boldface Z, and I find myself pulled both ways. Even if one is inclined to use non-bold Z for finite cyclic groups, it is very natural to use Z for the infinite cyclic group when one identifies it with the additive group of the integers. ---------------------------------------------------------------------- You ask whether the "\psi" on p.49, line 3 means \psi_{x'}. Not quite: To make sense of what follows, one must understand \psi to mean the map that takes each coset [x']\in A'/B' to \psi_{x'}.
This is well-defined by the preceding two lines; and since for each x', he has made \psi_{x'} a homomorphism A/B --> C, the map \psi so constructed will be a homomorphism A'/B' --> Hom(A/B, C). Thanks for pointing out this unexplained notation; I'll put a note in the next version of the Companion. ---------------------------------------------------------------------- You ask about the notation "0 --> A'/B' --> Hom(A/B, C)" in the second display on p.49. Lang means "a one-to-one homomorphism A'/B' --> Hom(A/B, C)". He is using the notation of exact sequences, introduced on p.15. If one writes a sequence "0 --> X --> Y" of abelian groups, then since a homomorphism from 0 is uniquely determined, it is probably not being written there to focus attention on that map; rather, if one states that the sequence is exact, this means that the kernel of the map X --> Y is the image of the map 0 --> X; in other words, that the kernel of the map X --> Y is zero, in other words, that the map X --> Y is one-to-one. Since Lang has not been talking about exact sequences, it is sloppy of him to use this notation here to express one-oneness; but it is an easy habit to get into when one works in areas in which exact sequences are commonly used. Likewise, writing X --> Y --> 0 indicates a surjective map X --> Y. ---------------------------------------------------------------------- You ask whether in the definition of a category (p.53) Mor(A,B) can be empty. Yes. It is never empty when the category is that of groups or monoids, because for any two groups or monoids A, B there is always the trivial morphism taking all elements of A to the identity in B. In the category of sets, Mor(A,B) is empty if and only if B is the empty set and A isn't. Soon, when we study rings, which we will require to have identity element 1 and homomorphisms carrying 1 to 1, we shall see that Mor(A,B) is empty in many more cases, e.g., when A is the ring Z_n and B = Z. 
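The statement about the category of sets can be checked by brute force on small finite sets. The following is only a sketch: the helper name set_maps is hypothetical (it is not from Lang or the Companion), and a set map is encoded as a Python dict.

```python
from itertools import product

def set_maps(A, B):
    """Return all functions A -> B, each encoded as a dict.

    set_maps is a hypothetical helper for illustration only.
    """
    A, B = list(A), list(B)
    # One function for each way of choosing a value in B for each element of A.
    return [dict(zip(A, values)) for values in product(B, repeat=len(A))]

print(len(set_maps([], [])))             # 1: the empty function
print(len(set_maps([], [0, 1])))         # 1: again the empty function
print(len(set_maps([0, 1], [])))         # 0: Mor(A,B) is empty
print(len(set_maps([0, 1], [0, 1, 2])))  # 9 = 3^2
```

The only case giving no maps is the one where B is empty and A is not. For the ring example no computation is needed: a homomorphism Z_n --> Z carrying 1 to 1 would have to carry n.1 = 0 to n, which is nonzero in Z.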
---------------------------------------------------------------------- Concerning Lang's statement on the 3rd and 4th lines from the bottom of p.53, that most of our morphisms are actually mappings or closely related to mappings, you ask what the distinction is between a morphism and a mapping, and for examples where morphisms are not mappings, in particular where they are "closely related to mappings". By a mapping, Lang means a function. The morphisms in a category, on the other hand, are simply whatever elements form the sets Mor(A,B). For an example where morphisms have nothing to do with set-maps, let \Gamma be any graph (I hope you've seen the concept; basically, a diagram consisting of some dots called "vertices", and "edges" connecting some of the vertices.) Define the category C_\Gamma to have for objects the vertices of \Gamma, and for two such vertices x and y, let a morphism x -> y mean a "path" from x to y in \Gamma, i.e., a sequence of consecutive edges starting at x and ending at y. For each x, we consider an "empty sequence of edges" to form a path from x to x, which we call the identity morphism of x. One defines the composite of a path f from x to y with a path g from y to z to be the path from x to z gotten by laying f and g end-to-end. One finds that C_\Gamma satisfies the axioms of a category; but the morphisms are not in any sense mappings from one set to another. For an example where they are "close to mappings", see the paragraph about the category "Rel" beginning near the bottom of p.33 of the Companion. If one looks at relations as "multi-valued functions" (where for each x\in X, the elements y\in Y such that (x,y)\in R are considered "the values of R at x"), then these relations are conceptually "close to mappings". For more examples, including some where the morphisms are "closer" to maps than in that one, see my Math 245 notes, section 6.2.
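Here is a minimal sketch of the path category C_\Gamma described above, under some assumptions of my own: the edges are taken to be directed, the three-edge graph EDGES is made up for illustration, and the names Path, identity, edge and then are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    """A morphism of C_Gamma: a sequence of consecutive edges."""
    source: str
    target: str
    edges: tuple = ()   # names of the edges, in order; empty for an identity

# Made-up directed graph: edge name -> (start vertex, end vertex).
EDGES = {"a": ("x", "y"), "b": ("y", "z"), "c": ("z", "x")}

def identity(v):
    """The empty path at the vertex v."""
    return Path(v, v, ())

def edge(name):
    """The path consisting of a single edge."""
    s, t = EDGES[name]
    return Path(s, t, (name,))

def then(f, g):
    """Lay f and g end-to-end; defined only if f ends where g starts."""
    if f.target != g.source:
        raise ValueError("paths are not composable")
    return Path(f.source, g.target, f.edges + g.edges)

a, b, c = edge("a"), edge("b"), edge("c")
# Category axioms, checked on this sample:
assert then(then(a, b), c) == then(a, then(b, c))         # associativity
assert then(identity("x"), a) == a == then(a, identity("y"))  # identities
```

Nothing in this code treats a morphism as a function from one set to another; a morphism is just a record of where the path starts, where it ends, and which edges it traverses.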
---------------------------------------------------------------------- Regarding the analogy between the concept of an abstract category and that of an abstract group, noted in the material in the Companion referring to Lang's p.53, you ask whether there is an analog of Cayley's Theorem for categories. The answer is "Yes, but ... ". If one assumes a set-theory such as I sketch on p.34 of the Companion, then any category C which is "small" with respect to a given universe will have a concretization -- a faithful functor to the category of sets -- in that universe. But note that a category such as the category of groups in a given universe is not itself small with respect to that universe; and for a general category which, like that one, merely has objects and morphism-sets all lying in a given universe, one can't necessarily find a concretization by sets in that universe -- though one can by sets in any larger universe. Sorry if this sounds confusing. The right way to come to it is by first figuring out how the proof should work, then noting how the sets that one constructs are related to the set-theoretic properties of the original category. In my Math 245 notes, I sketch the idea of the proof starting at the bottom of p.151 (ignore the blank p.152 if you're looking at it online, and continue on p.153), before I have introduced "universes". Then section 6.4 introduces universes, section 6.5 defines "functor", and Theorem 6.5.6 gives Cayley's Theorem, as the statement that every small category (category which is a member of one's chosen universe) admits a concretization (a faithful functor into the category of sets in that universe). ---------------------------------------------------------------------- You ask what the difficulty is with categories having proper classes of objects, alluded to at the bottom of p.31 of the Companion (re Lang, p.54). 
Well, to start with, although Lang introduces the concept of monoid (from which he gets that of group) as a set "with" a law of composition, and likewise now says that a category "consists of a collection of objects ... and for two objects ... a set Mor(A,B) ...", the way to make such things precise is to define a monoid as an ordered pair consisting of the set and the operation (or better, an ordered 3-tuple whose third member is the identity element, but I won't go into the reason here), subject to the appropriate conditions, and similarly to define a category A as (at least) an ordered 3-tuple, with first member Ob(A), second member (Mor(X,Y))_{X,Y\in Ob(A)} and third member specifying the composition operation. But if a tuple is defined as a certain sort of function, one can't define a tuple whose entries are proper classes. Given Lang's definition of category, you say the only reason you can see for wanting categories to be sets "is if you want to perform weird things like form categories of categories". Well, one does want to look at such things; but without trying to convince you of this, I can surely point out that one might want to use a set which contains two or three or countably many categories. If you look at exercises I.12:3-I.12:4, you will see the concept of a "variety of groups". It is not hard to show that the set of all varieties of groups "forms a lattice" -- except that it isn't a set. Varieties of groups can be studied even without category theory, so the need to get around this problem is not just a consequence of the category-theoretic viewpoint. Anyway, the solution is very elegant; I recommend section 6.4 of the Math 245 notes.
An amusing feature of the situation is the reason I indicate there for proposing the axiom that every set is contained in a universe, rather than the weaker axiom that there is at least one universe: The latter approach tacitly creates one realm (the inside of the universe) within which "ordinary mathematics" is done, and a grander realm in which categorists work; the former creates a situation in which any sort of mathematics can be considered to be done in any universe, and any consideration that looks at it globally can be done in the next larger universe. So the justification of the stronger axiom is to avoid setting up a mathematics in which categorists would have an "elite" role! ---------------------------------------------------------------------- You ask what the morphisms are in the category used in the definition of the Grothendieck group (p.58). If f is a homomorphism M --> A, and g a homomorphism M --> B, then a morphism from f to g in that category is a homomorphism h: A -> B that makes a commuting diagram with f and g, i.e., such that g = hf. The way to see that this is what Lang intends is to read carefully the preceding paragraph, where he states precisely what he means by a morphism in the category of abelian groups with set-maps of S into them. Then it is reasonably safe to assume that if he doesn't say what he means the morphisms to be in this new category, it is because the situation is completely analogous. Conceptually, I recommend describing this auxiliary category as having for objects "abelian groups with homomorphisms of the monoid M into them". I.e., where Lang calls the homomorphism f: M --> A the object, I would call the pair (A, f) the object. This makes the definition of a morphism (A,f) --> (B,g) more intuitively natural -- it is a homomorphism between these groups which respects the "additional structure" on the groups, namely, the maps of M into them.
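To make the condition g = hf concrete, here is a small sketch with an example of my own choosing (it is not taken from Lang): M is the additive monoid of natural numbers, A = Z with f the inclusion, and B = Z/5Z with g reduction mod 5. The function names are hypothetical, and the check runs only over a finite sample of M, so it illustrates the condition rather than proving anything.

```python
def f(n):
    """M = (N, +) --> A = Z, the inclusion."""
    return n

def g(n):
    """M = (N, +) --> B = Z/5Z, reduction mod 5."""
    return n % 5

def h(a):
    """A homomorphism Z --> Z/5Z: reduction mod 5."""
    return a % 5

def h2(a):
    """Another homomorphism Z --> Z/5Z: a |-> 2a mod 5."""
    return (2 * a) % 5

def is_morphism(h, f, g, sample):
    """Check g = h o f on the given sample of elements of M."""
    return all(g(m) == h(f(m)) for m in sample)

print(is_morphism(h, f, g, range(50)))    # True: h is a morphism (A,f) --> (B,g)
print(is_morphism(h2, f, g, range(50)))   # False: h2 does not respect the maps from M
```

Both h and h2 are homomorphisms of abelian groups, but only h respects the "additional structure" given by the maps of M.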
---------------------------------------------------------------------- Concerning the fact that Lang says near the bottom of p.61 that in the fiber product shown earlier on that page, one calls p_1 the "pullback" of g by f, you ask in what way p_1 is related to g. I suggest you think about the form that the fiber product diagram takes when C is the category of sets -- this is noted in the comment on that page in the Companion -- and verify for yourself that in that case, if g is 1-1 then so is p_1, and that if g is surjective then so is p_1. These two facts don't themselves prove that p_1 is naturally associated to g; but I think that the understanding that proving those two facts will give you should convince you that this is so. ---------------------------------------------------------------------- You ask about Lang's use of the word "rule" in defining "functor" on p.62. For "rule" read "function". I think he is avoiding the word "function" because a function is supposed to have a _set_ as domain, and he is defining a category to be a "collection" which may be too big to form a set. The discussion in the Companion starting near the bottom of p.31 tells how to get around that dilemma; thus functors can indeed be considered to consist of functions (one function on objects and one on morphisms). ---------------------------------------------------------------------- (I guess this question was suggested by the term "natural transformation" on p.65 of Lang.) > I'm wondering if the terms "canonical" and "natural" have some > well-defined meaning in category theory, e.g. a construction is > canonical if all other similar functors factor through its functor. The best definition I can give of "canonical" is the one in the comment in the Companion to p.14 of Lang: "A canonical object means one determined in a special way by the data being considered."
It would be a pity if someone gave the term a technical meaning, even if that meaning matched the above sense in a large class of cases, because we need words for the "meta-discussion" of mathematics, and as more and more of these words are given technical meanings (e.g., "simple", "natural" etc.) it becomes harder and harder to talk about mathematical concepts, as distinct from making formal mathematical statements. As I said in class, "natural" came to be used to mean "satisfying the conditions for a morphism of functors", and many people now use the word "natural transformation" to mean morphism of functors. But for the reason indicated above, I strongly favor just calling a morphism of functors a morphism of functors, and I'm glad Lang does so. ---------------------------------------------------------------------- Regarding Lang's introduction to category theory (pp.53-65), you ask > What is category theory actually useful for, beyond adding notation > which is uniform across mathematics? ... in section 11, Lang doesn't > prove any big theorems. Are they simply coming later? ... We have seen one general result: Though Lang gave the statement that "universal repelling" and "universal attracting" objects are unique up to unique isomorphism in passing, it is actually a powerful tool -- easy to see in the abstract, but not necessarily in a specific situation where the conditions for which an object is universal are complicated. And I mentioned today that the construction of free groups (and the same applies to coproducts) as subobjects of big direct products worked in a very general category-theoretic context. The choice Lang makes is to introduce just some basic concepts of category-theory, and present them as a language in which to formulate some specific results about groups. Getting into the technicalities needed to present general category-theoretic results would take us too far afield from the main topics of his text, and math 250A. 
In my 245 notes, I make a somewhat different choice: I develop a large number of examples of universal constructions in algebra emphasizing the parallelisms but not introducing the concept of category, and then, with all these on hand for motivation, I define category, and develop the theory that unifies these results and provides common proofs for the results shown separately. Some of my exercises in the Companion for section I.11 of Lang do obtain general category-theoretic results, though not real biggies: I.11:1(d), I.11:4(b), and I.11:5. ---------------------------------------------------------------------- You ask whether the description of the free abelian group on M as a group of maps from M to the integers has an analog for the free group on M (p.66). Not an obvious one. In the case of the free abelian group, we were able to associate to different elements of M functions with disjoint supports (where the "support" of a function means the set of points where it is nonzero). Group-valued functions with disjoint supports clearly commute with each other (because evaluating two such functions at any point, one of them is 1, and 1 commutes with any other element); hence using such functions, we cannot generate any noncommutative group. Thus any noncommutative analog of the description of the free abelian group cannot have the property that the function associated to each x\in M has support {x}. If we abandon that assumption, then it is no longer natural to identify the domains of our functions with M; so we should look for a description of a free group on M as a group of functions on _some_ set X, with values in some not-necessarily-commutative groups. This is essentially Lang's proof of Lemma I.12.2, the set being {(i,phi)} and the groups being G_{i,phi}. But in that form, it doesn't look much like the description of the free abelian group.
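The remark about disjoint supports can be tried out on a small case. In the sketch below (my own illustration, with hypothetical names), the functions take values in the noncommutative group S_3, written as permutations of {0,1,2}, and a function is stored as a dict listing only the points of its support.

```python
IDENT = (0, 1, 2)   # identity permutation of {0, 1, 2}

def mul(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def pointwise(f, g, points):
    """Pointwise product of two S_3-valued functions on the given points."""
    return {x: mul(f.get(x, IDENT), g.get(x, IDENT)) for x in points}

POINTS = ["a", "b"]
s = (1, 0, 2)   # transposition interchanging 0 and 1
t = (0, 2, 1)   # transposition interchanging 1 and 2

f = {"a": s}    # support {a}
g = {"b": t}    # support {b}, disjoint from that of f
k = {"a": t}    # support {a}, the same as that of f

print(mul(s, t) == mul(t, s))                                  # False: S_3 is noncommutative
print(pointwise(f, g, POINTS) == pointwise(g, f, POINTS))      # True: disjoint supports commute
print(pointwise(f, k, POINTS) == pointwise(k, f, POINTS))      # False: overlapping supports need not
```

So noncommutativity can only come from functions whose supports overlap, which is why the function attached to x cannot be required to have support {x}.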
---------------------------------------------------------------------- Regarding the construction in the first display on p.67, you ask why we can't let the index set I be "the set of isomorphism classes of all groups generated by S", and each G_i a representative of the class i. The problem is that these isomorphism classes don't form a set -- each one of them fails to be a set, because elements of a group can be taken to be arbitrary objects in our set theory; so that if we had a set of all groups of a given order, then the set of their identity elements (for instance) would be the set of all sets, which we know leads to paradoxes. Of course, ZFC doesn't say "you aren't allowed to perform a construction if one can show that it would lead to a paradox"; it simply avoids paradoxes by not giving us the tools to create a "set of all" anythings, except for things built from specific previous sets. So in this case, the equivalence classes you want are not sets, and we can't talk about them as such. However, groups whose underlying set is contained in T do form a set, and we can construct them all. If we took one member of each isomorphism class within the set of groups with underlying set contained in T, given with maps of S into them which generated them, we would have what you wanted: a set of groups with maps of S into them such that every group generated by the image of a map of S into it was isomorphic to one and only one of them. But since to prove that this exists, we have to start with the whole set of groups with underlying set contained in T, it would just add to the length of our proof to pare this set down by throwing away all but one member of each isomorphism class. ---------------------------------------------------------------------- You ask how to prove Lang's assertion that the group presented as in the second display on p.69, < x, y, z | xyx^{-1}y^{-1} = y, yzy^{-1}z^{-1} = z, zxz^{-1}x^{-1} = x >, is trivial.
Let's rewrite the three relations defining this group in several ways:

   (1a) xyx^{-1}y^{-1} = y    (1b) xy = y^2 x    (1c) xyx^{-1} = y^2       (1d) yx^{-1} = x^{-1}y^2
   (2a) yzy^{-1}z^{-1} = z    (2b) yz = z^2 y
   (3a) zxz^{-1}x^{-1} = x    (3b) zx = x^2 z    (3c) xzx^{-1} = x^{-1}z   (3d) zx^{-1} = x^{-2}z

Here (3c) is gotten by applying x^{-1}(...)x^{-1} to both sides of (3b), and interchanging the two sides. The point of (1c) and (3c) is to express the relations in question as describing the action of conjugation by x on the two other generators; the point of (1d) and (3d) is to express these same relations as rules for moving "x^{-1}" to the left past y or z, modifying the latter appropriately. If we now conjugate (2b) by x, then by (1c) and (3c) the result is

   (4) (y^2)(x^{-1}z) = (x^{-1}z)^2 y^2.

We now apply (1d) and (3d) to bring all occurrences of x^{-1} to the left on each side of (4), getting

   (5) x^{-1} y^4 z = x^{-3} z^2 y^2.

Now (2b) says yzy^{-1} = z^2, hence y^2 z y^{-2} = z^4, hence y^4 z y^{-2} = y^2 z^4. So applying x^3(...)y^{-2} to both sides of (5) gives x^2 y^2 z^4 = z^2, i.e.,

   (6) x^2 y^2 z^2 = e.

Now we can conjugate (6) by x using (1c) and (3c), and then use (3d) to move x^{-1} to the left past z, to get

   (7) x^2 y^4 x^{-3} z^2 = e.

By (6), z^2 = y^{-2} x^{-2}; substituting this into (7) and cancelling, we get

   (8) x^3 = y^2.

Conjugating (8) by x using (1c) gives x^3 = y^4; comparing this with (8) we get y^2 = e, hence (1c) gives xyx^{-1} = e, or

   (9) y = e.

One can now get z = e and x = e either by calling on the (cyclic) symmetry of the original system of equations, or by substituting (9) into (2a), and substituting the resulting equation z=e into (3a). - - - Incidentally, it is known that if one goes one step further, the presentation G = < x, y, z, w | xyx^{-1}y^{-1} = y, yzy^{-1}z^{-1} = z, zwz^{-1}w^{-1} = w, wxw^{-1}x^{-1} = x > gives a very nontrivial group. The idea is to first look at the groups G_1 = < x, y, z | xyx^{-1}y^{-1} = y, yzy^{-1}z^{-1} = z >, G_2 = < z, w, x | zwz^{-1}w^{-1} = w, wxw^{-1}x^{-1} = x >, and show that in each of them, the subgroup generated by x and z is free on those two generators.
The group G is then what Lang would call the "fibered coproduct of G_1 and G_2 over the free group on {x,z}", which group theorists would call the "free product of G_1 and G_2 with amalgamation of" that common subgroup, and still others would call the "pushout of the diagram formed by" those three groups. Anyway, the general structure theorem for "free products with amalgamation" shows that G_1 and G_2 are both embedded in G. On the other hand, any homomorphic image of G in which one of x,y,z,w has finite order is the trivial group: Given, say, an equation x^n=e, one deduces successively equations of that form for y, z, and w with different exponents, and finally, another such equation for x with a different exponent, which together with the given equation implies x=e. (By the same principle, as soon as I got (9) above, I knew I was "home free".) So this is an example of a finitely generated nontrivial group with no finite nontrivial homomorphic images. ---------------------------------------------------------------------- You ask what Lang means on p.70, two lines above first display, by "the free group with generators u(b), w and relations SL1 - SL4". He means the group presented by those generators and relations. Since the idea of "free" is "not satisfying any relations other than those that have to be satisfied", it is sometimes colloquially used to describe any universal algebraic object. However, since it has been given the specific meaning "not satisfying any relations other than those implied by the identities", and one has the distinct term "presented by the indicated generators and relations" for what Lang means here, he ought to use that term. ---------------------------------------------------------------------- You ask, regarding Proposition I.12.3, p.70, whether there is a general schema to show that coproducts exist in a given category. Well, categories are not all alike, and coproducts don't exist in all of them! 
But in a large class of "naturally occurring" categories, exactly the method Lang uses here allows one to prove that any family (X_i)_{i\in I} of objects has a coproduct: One finds a set of pairs (object, family of maps of the X_i into it) which is general enough to approximate "all" such pairs; takes the direct product of these objects, and uses "the subobject generated by the images of the given maps". Of course, a general category doesn't have such concepts as "subobject", but in the naturally occurring categories for which this method works (groups, monoids, rings, etc.) one does. On the other hand, the description of the _structure_ of the coproduct of a family in Proposition I.12.4-5 is special to groups. ---------------------------------------------------------------------- You ask about the cardinality estimate in the middle of p.71; specifically, why a countable union of sets each having the same cardinality as S also has the same cardinality as S. By Theorem A2.3.3, p.887. (Taking the direct product with D is equivalent to taking the union of countably many copies.) ---------------------------------------------------------------------- You ask about the statement on p.71, below the middle display, "we may assume without loss of generality that G = S_gamma for some gamma, and that g = psi for some psi \in Phi_gamma". Whenever an author says "We can assume without loss of generality that X is the case", this means "If we know the result true when X is the case, then we can prove the general case from this." So in the present situation, assume provisionally that whenever G = S_gamma and g = psi \in Phi_gamma, there exists g* as in the preceding display. To prove the general case from this, remember that Gamma and the sets Phi_gamma were constructed to give, up to isomorphism, "all" groups of order card(S) and all families of homomorphisms of the G_i into them.
So given an arbitrary G with card(G) = card(S), we can find an isomorphism alpha from G to some S_gamma, and by composing alpha with the family of maps g, we must get some phi\in Phi_gamma. Now the assumed result for S_gamma and phi gives us a map g_*: F_0 --> S_gamma with the indicated property; composing this with alpha^-1, we get the desired map F_0 --> G. ---------------------------------------------------------------------- > On page 71, Lang mentions in the example that G_2\coprod G_3 is the > group generated by two elements S,T with relations S^2=1,(ST)^3=1. > How does one show this? I guess by using the "several properties of S, T" proved in the book quoted on p.72 (sentence after second display). I'll add to future versions of the Companion a note to skip this example (or at least to realize that one does not have enough information to prove the assertions). A method of proving that certain elements of SL(2,Z) generate free subgroups is indicated in Exercise 2.4:5 (p.34) of my Math 245 notes. But it doesn't go as far as getting the structure of the whole group. ---------------------------------------------------------------------- Regarding the assumption that rings must have unit (p.83), and my comment in the Companion (p.43) that there is a trick which reduces the study of rings without unit to that of rings with unit > I'm curious about the trick for studying rings without units. > ... The reason I ask is that in manifold theory, the ring of > C^\infty functions with compact support ... has no unit ... > ... Is the idea somehow to just think of rings without units > as ideals of larger rings with unit? In a way, yes. Given any nonunital ring A, one can make the abelian group A' = Z (+) A into a ring, by taking the element 1\in Z to behave as the unit, using the given multiplication on A to define the product of two elements of A, and using distributivity and these two facts to get a multiplication on all of A', namely (m+a)(n+b) = mn + (na + mb + ab).
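The multiplication just written down can be tried out on a small example. The sketch below takes A to be the nonunital ring 2Z of even integers (my choice of example), stores m + a as the pair (m, a), and checks the ring axioms only on a finite sample; the function names are hypothetical.

```python
def add(u, v):
    """Addition in A' = Z (+) A, componentwise."""
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    """(m + a)(n + b) = mn + (na + mb + ab), with A = 2Z."""
    (m, a), (n, b) = u, v
    return (m * n, n * a + m * b + a * b)

ONE = (1, 0)   # the element 1 of Z, which acts as the unit of A'

# Finite sample of A': m in {-2,...,2}, a an even integer in {-4,...,4}.
sample = [(m, a) for m in range(-2, 3) for a in range(-4, 5, 2)]

# ONE is a two-sided unit.
assert all(mul(ONE, u) == u == mul(u, ONE) for u in sample)
# Associativity on the sample.
assert all(mul(mul(u, v), w) == mul(u, mul(v, w))
           for u in sample for v in sample for w in sample)
# Distributivity on the sample.
assert all(mul(u, add(v, w)) == add(mul(u, v), mul(u, w))
           for u in sample for v in sample for w in sample)
# Products with an element (0, b) of A have Z-component 0, i.e. lie in A.
assert all(mul(u, (0, b))[0] == 0 == mul((0, b), u)[0]
           for u in sample for b in (-2, 0, 2))
```

Note that (1, 0) and (0, 2) are kept distinct here even though 2Z sits inside Z: the unit of A' is a new element, not one borrowed from an overring of A.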
In this situation, we see that A will form an ideal of A'. If one's rings are algebras over a field, as in the situation you referred to, it is more natural to use that field instead of Z. So if A is the ring of C^\infty functions with compact support, then A' will be the ring of C^\infty functions which are constant off a compact set. (This needs a bit of qualification in the case where the manifold is itself compact.) For an example of the nice properties of this construction, note that if a is an element of a nonunital ring A, the left ideal of A generated by a does not have the form Aa, but Za + Aa. But if we regard A as lying within A', this is simply A' a. The ring A' constructed above has a natural homomorphism to Z, given by m + a |-> m. In fact, the category of unital rings "over Z" in the sense of Lang, p.61, is essentially the same as the category of nonunital rings; and that is what I feel justifies regarding the study of nonunital rings as subsumed by that of unital rings. (Again, the same applies to algebras, mutatis mutandis.) ---------------------------------------------------------------------- In connection with the definition of ring (p.83) you asked whether there were cases where one might want to consider "rings with a non-commutative addition operation". If one left commutativity of addition out of the definition, it would still follow from the distributive laws. Namely, for any two elements a and b we can expand (a+b)(1+1) in two ways: by using left distributivity and then right distributivity, or the other way around. Equating the results gives a+b+a+b = a+a+b+b. Cancelling the common term a on the left and the common term b on the right, we get b+a = a+b. There are variants of the concept of ring where this argument won't prove complete commutativity. E.g., if the ring doesn't have unit, a similar computation will prove that any two elements that are products commute with each other under addition, but not arbitrary elements.
And if one only assumes distributivity on one side, one can have more widespread noncommutativity. There are concepts such as "near-ring" and "half-ring" embodying such weakened assumptions; they are studied by a small number of people, of which I am not one.
----------------------------------------------------------------------
Regarding the definition of principal ideal ring, p.86,
> Are there examples of principal ideal rings which are not
> principal ideal domains?
Yes. It is easy to check that any homomorphic image of a principal ideal ring is a principal ideal ring. Hence for any n, Z/nZ, being a homomorphic image of Z, is a principal ideal ring; but for n > 0 not prime, it will not be an integral domain.
----------------------------------------------------------------------
On p.87,
> Lang says "If a, b are left ideals of A, then a+b (the sum being taken
> as additive subgroup of A) is obviously a left ideal." What does he
> mean by "the sum being taken as additive subgroup of A" ?
He means {r+s | r\in a, s\in b}. Cf. last paragraph on p.37. Trying to answer this question sent me looking to see where Lang first introduces that notation. He introduces such notation in _multiplicative_ form on p.6, in the paragraph before the middle of the page, and uses it commonly in sections I.2-I.3. As far as I can see, he first uses it in additive form on p.37 at the point mentioned above, taking for granted the transition from multiplicative to additive notation.
----------------------------------------------------------------------
You asked whether homomorphisms whose kernels are prime ideals (p.92) in some way "generate" all homomorphisms.
Basically, the answer is "no", but there are certain positive statements one can make. As you know, the monoid of all ideals of the ring of integers is generated under multiplication by the set of prime ideals. More generally, this is true in all principal ideal domains.
The two questions one can ask are "Does this fact lead to corresponding statements about _homomorphisms_ having these ideals as kernels?" and "Does this fact generalize to (some or all) rings that are not principal ideal domains?" To the former question, the answer is "Yes in the context of module homomorphisms, though not in the context of ring homomorphisms"; to the latter, the answer is "Yes, with a weakened conclusion, for certain classes of rings." If you want more details, you can ask me in office hours.
----------------------------------------------------------------------
You ask why the first line on p.93 shows that y\in \frak m. (I'm using "\frak" as in TeX notation for "fraktur font".)
Interestingly, this is the same point that stumps many 113 students in the proof of Euclid's Lemma, which is really a special case of this result. To see why the right-hand side is in \frak m, look at the preceding steps of this proof and see what elements we already know to be in \frak m, with special attention to terms similar to those in this expression. By assumption xy\in \frak m, and behold, there is an xy dividing one summand in the right-hand side of the line in question; so you just have to look at the other summand, yu. And the preceding line of the proof says "u \in \frak m", so that's in \frak m too. So the sum is in \frak m.
----------------------------------------------------------------------
You ask about the statement on p.94 that "given an integer n > 1, the units in the ring Z/nZ consist of those residue classes mod nZ which are represented by integers m \neq 0 and prime to n."
One way to think of this is that for such an m, multiplication by m doesn't take any number not divisible by n to any number that is divisible by n. This is clear if we know unique factorization of integers; and it follows that multiplication by m will be 1-1 on Z/nZ, hence since that set is finite, it will be invertible, so the residue class of m is a unit.
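For small n one can check the preceding paragraph by brute force. The following is only a sketch; the function names are hypothetical, and the search is deliberately naive rather than efficient.

```python
from math import gcd

def mult_is_injective(m, n):
    """True if x -> m*x is one-to-one on Z/nZ (checked by listing all images)."""
    return len({(m * x) % n for x in range(n)}) == n

def units(n):
    """Residues m mod n for which some u satisfies m*u = 1 in Z/nZ."""
    return [m for m in range(n) if any((m * u) % n == 1 for u in range(n))]

n = 12
print(units(n))                                            # the units of Z/12Z
print([m for m in range(n) if gcd(m, n) == 1])             # residues prime to 12
print([m for m in range(n) if mult_is_injective(m, n)])    # where x -> m*x is 1-1
```

For n = 12 the three printed lists agree, as the statement from Lang predicts: the units, the residues prime to n, and the residues m for which multiplication by m is 1-1 on Z/nZ are the same set.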
If we don't assume familiarity with the unique factorization property of the integers (which will be deduced in a later section from general results about certain kinds of rings), we can, as you say, use the Euclidean algorithm to get um + vn = 1, so um is congruent to 1 mod n, so the residue of m is a unit.
----------------------------------------------------------------------
You ask about the statement on p.94 that "given an integer n > 1, the units in the ring Z/nZ consist of those residue classes mod nZ which are represented by integers m \neq 0 and prime to n."
One way to think of this is that for such an m, multiplication by m doesn't take any number not divisible by n to any number that is divisible by n. This is clear if we know unique factorization of integers; and it follows that multiplication by m will be 1-1 on Z/nZ, hence since Z/nZ is finite, it will be invertible, so the residue class of m is a unit. The proof of the unique factorization property of the integers (which will actually be given in a later section for a large class of rings) is based on the Euclidean algorithm, and as you also mention, this can be used directly to get um + vn = 1, making um congruent to 1 mod n, and so proving the residue of m a unit.
----------------------------------------------------------------------
You ask about the origin of the term "Chinese Remainder Theorem" (p.94).
That result for the ring of integers was known to Chinese mathematicians several centuries back. My understanding is that they were concerned with chronological cycles, and whether various combinations of positions in such cycles would occur. In particular, Chinese culture has a 10-year cycle and a 12-year cycle; each point in each cycle has a name; the points in the 12-year cycle are associated with animals and give the "year of the dog" etc. that we hear about every Chinese New Year.
I haven't heard of the items in the 10-year cycle having any such significance, but it is equally important in specifying years. Since 10 and 12 are not relatively prime, the version of the Chinese Remainder Theorem in Lang is not applicable to that case; but for any two ideals I and J of a ring R and elements x\in R/I, y\in R/J, one can show that there is an element of R belonging to the intersection of x and y if and only if x and y have the same images in R/(I+J). Thus, a member of Z/10Z and a member of Z/12Z can be realized by a common element of Z if and only if they have the same image in Z/2Z. Hence a point in the 10-year cycle and a point in the 12-year cycle will be reached together at some time if and only if either they both have odd positions in their cycles, or both have even positions. Probably the background of the theorem includes more complicated cycles as well. For instance, since they have a lunar month, they have to alternate between 12-month years and 13-month years, and the cycle of these takes 19 years. (The same applies to the traditional Jewish calendar.) But I don't know more details. ---------------------------------------------------------------------- You ask about Lang's statement on p.97, beginning of section II.3, that "there are polynomials over a finite field which cannot be identified with polynomial functions in that field." If p is any prime, then over the field k = Z/pZ (the field of p elements), the polynomial f(x) = x(x-1)(x-2)...(x-(p-1)) has the property that for all a \in k (i.e., for each of the values a = 0, 1, 2, ..., p-1), f(a) = 0. So the function gotten by evaluating f at elements of k is the zero function, although f and 0 are different polynomials. ---------------------------------------------------------------------- You ask whether left adjoints (Companion, p.52, one of the items to go after end of section on p.107 of Lang) are unique. Indeed they are -- up to isomorphism, of course. 
Lang observed that "universally repelling objects" are unique up to unique isomorphism, and that free groups etc. could be considered universally repelling objects in certain auxiliary categories. The same applies to the objects F(X) which fit together to form the left adjoint of any functor U that has one -- each of them corresponds to an initial (i.e., universally repelling) object in an appropriate auxiliary category, and so is unique up to a canonical isomorphism; and since the morphisms that join the objects F(X) into a functor F are also determined uniquely (by instances of the universal properties of the separate objects), the functor F (if it exists) is unique up to a canonical isomorphism. ---------------------------------------------------------------------- You ask why mathematicians consider localization of commutative rings (Lang, pp.107-111). It's hard to know how to answer such a question; for me the first answer is "Because it is interesting". But I'll give you a couple more examples of how localizations arise, that might make sense to you. First, consider a polynomial ring K[X] (K a field) as a subring of the ring K[[X]] of formal power series a_0 X^0 + ... + a_n X^n + ... (not defined or studied in 250A). Many polynomials that are not invertible in K[X] become invertible in K[[X]]; e.g., 1-X has the inverse 1+X+X^2+...+X^n+... . So one can consider within K[[X]] the subring generated by the elements f(x) g(x)^{-1} such that f(x) is a polynomial, and g(x) is a polynomial that is invertible as a formal power series. This will be the localization of K[X] at the set of polynomials of the latter sort, namely, the polynomials that do not belong to the ideal (X); i.e., it will be K[X]_{(X)}. Second, consider a ring Z/pZ (p a prime). Not only can every integer a be mapped to an element [a] of this ring; given any fraction a/b such that b is not divisible by p, we can map a/b to [a][b]^{-1}. (E.g., for p = 5, we can map 2/3 to [2] . [3]^{-1} = [2] . 
[2] = [4].) The set of rational numbers for which we can do this forms the localization of Z at (p), written Z_{(p)}, and the standard homomorphism Z --> Z/pZ extends to a homomorphism Z_{(p)} --> Z/pZ.
----------------------------------------------------------------------
Regarding the verification of the statement in Lang that the map h of the next-to-last display on p.109 is a homomorphism, you write "I don't see how I can justify commuting f(s)^(-1) with f(a')f(s')^(-1)".
In this section, "ring" means "commutative ring" unless the contrary is stated; so in defining the category C at the top of the page, B is assumed commutative! However, though I don't know whether Lang thought about this, everything he says on this page remains true if the objects B, B' etc. are allowed to be noncommutative rings, as long as A remains commutative. The key fact is:
| Lemma. If x and y are elements of a monoid M, and commute with
| each other, and if y is invertible in M, then x also commutes
| with y^{-1}.
Proof. x y^{-1} = (y^{-1} y) x y^{-1} = y^{-1} (y x) y^{-1} = y^{-1} (x y) y^{-1} = y^{-1} x (y y^{-1}) = y^{-1} x. []
Note that in the above situation, if x is also invertible, then x^{-1} will similarly commute with y, and by a second application of the same principle, it will commute with y^{-1}. So the step you ask about can indeed be justified without assuming B commutative.
----------------------------------------------------------------------
You ask whether, on the second line of p.56 of the Companion (material on p.109 of Lang), "h(s^{-1})" should be "h(s^{-1} a)".
Right. Thanks!
----------------------------------------------------------------------
Regarding the construction of localizing a commutative ring at a prime ideal (Lang, p.110) you ask
> Can all local rings be thought of as arising from localizing using
> the complement of a prime ideal?
Well, if you take any local ring R, and localize it using the complement of its own maximal ideal, the result will again be R (since the complement of the maximal ideal of R is exactly the set of invertible elements). So every local ring arises in that way by localizing itself. Perhaps what you really meant was "Does every local ring that comes up naturally in algebra arise by localizing some naturally arising non-local ring using the complement of a prime ideal?" The answer is no. An example is the ring K[[X]] of formal power series a_0 X^0 + ... + a_n X^n + ... over a field K. This is not defined or studied in 250A; but see Lang, section IV.9. Another sort of example that doesn't arise by localization is the field Z/pZ, or the ring Z/p^n Z (p a prime, n a positive integer). ---------------------------------------------------------------------- Concerning the discussion of specializations of fields in the Companion (p.57, material concerning p.111 of Lang), you ask what I mean by "minimal domain" in the first sentence of the second paragraph. As noted in the first paragraph, a specialization \phi on a field E is a map which is not defined on all of E -- its domain is a certain subring E_\phi of E. So one can make the set of specializations from E to K a partially ordered set by writing \phi\leq\psi if E_\phi\subseteq E_\psi, and \phi is the restriction of \psi to E_\phi. Looking at all specializations E --> K whose domains contain a certain set, one can look at minimal members of this partially ordered set under the above ordering. ---------------------------------------------------------------------- > Lang defines left module on page 117. Shouldnt he include the > condition (ab)x=a(bx) as well? When he speaks of "an operation of A on M (viewing A as a multiplicative monoid)", this is understood to entail that equation; see p.26, line 5. 
(On p.26, G is restricted to be a group, so Lang is sloppy in taking for granted without having said it that the same conditions define operations of general monoids.) ---------------------------------------------------------------------- Many of your questions are based on the assumption that the word "algebra" does not presume associativity or the existence of unit. Lang's discussion of the topic on p.121 gives that impression, but as I say in the first sentence of p.64 of the Companion, "Lang's introduction to this concept is misleading"; and I make clear that the use of the word "algebra" does not imply absence of the associativity and unitality conditions. (One can consider, as we do in this course, associative unital rings; in other contexts one considers nonassociative and/or nonunital rings. The same applies to algebras.) And Lang himself, in the last sentence of the middle paragraph of p.121, cancels the impression given by what he says up to that point by writing "But in this book, unless otherwise specified, we shall assume that our algebras are associative and have a unit element." So -- if you want to know the answers to some questions about nonassociative and/or nonunital rings or algebras, you can ask them; but don't assume that that is what Lang or I mean in what we write about algebras. ---------------------------------------------------------------------- Regarding a comment in the Companion about p.129 of Lang, you ask > What is an overring? If R is a subring of S, then S is called an overring of R. ---------------------------------------------------------------------- You ask whether in Corollary III.4.3, p.135, the modules are over the same ring. Yes. Isomorphism of modules is only defined when they are over the same ring, so unless some wording is added to imply a nonstandard sense of "isomorphism", one can presume when "isomorphism" is mentioned that the modules are over the same ring. 
(Also, when modules or vector spaces are mentioned without specification of the ring or field, one can generally assume that there is some fixed ring or field in the background, which all modules or vector spaces mentioned are over.)
----------------------------------------------------------------------
You ask about my notes in the first long paragraph of p.72 of the Companion (in the material on Lang's p.138), about getting K_0.
It follows from the characterization of K_0 in the last paragraph on p.138 that K_0(R) can be gotten from K(R) by dividing out by the subgroup generated by the elements representing the free modules. (It takes some thought to verify this formally, but it is not hard.) Is it clear now why the statements in the Companion follow?
----------------------------------------------------------------------
You ask about the statement on p.73 of the Companion, lines 9-10, in the notes to follow the section ending on p.139 of Lang, that "all but one condition" for preservation of short exact sequences had been proved in section III.2.
The results I meant were Propositions III.2.1 and III.2.2, pp.131-132. For instance, in the case of Hom(X,-), if we restrict the result of Proposition III.2.2 to short exact sequences, i.e., put a "->0" at the right end of the given exact sequence, then the proposition as it stands shows that the sequence of hom-sets shown at the bottom of the page is exact -- but not that it would remain exact if "->0" were put at the end. And indeed, that is not true for general modules X. But that exactness condition is exactly the content of projectivity of X. Similarly, if Prop.III.2.1 is applied to short exact sequences, it fails to show one of the conditions that would be needed to preserve exactness, but that one condition is the content of the statement "Y is injective".
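To see the failure of that last exactness condition in a small case, here is a sketch using an example of my own choosing (it is not taken from Lang or the Companion): the short exact sequence 0 -> Z/2 -> Z/4 -> Z/2 -> 0 of Z-modules, with first map x |-> 2x and second map y |-> y mod 2, and X = Z/2. A homomorphism Z/m -> Z/n is recorded by the image t of 1, which must satisfy mt = 0 in Z/n; the helper name homs is hypothetical.

```python
def homs(m, n):
    """Homomorphisms Z/m -> Z/n, each recorded by the image t of 1 (need m*t = 0 mod n)."""
    return [t for t in range(n) if (m * t) % n == 0]

hom_left = homs(2, 2)    # Hom(Z/2, Z/2), for the left-hand term of the sequence
hom_mid = homs(2, 4)     # Hom(Z/2, Z/4)
hom_right = homs(2, 2)   # Hom(Z/2, Z/2), for the right-hand term

# Induced maps on hom-sets: compose with x -> 2x, respectively with y -> y mod 2.
image_of_first = {(2 * t) % 4 for t in hom_left}
kernel_of_second = {t for t in hom_mid if t % 2 == 0}
image_of_second = {t % 2 for t in hom_mid}

print(image_of_first == kernel_of_second)   # exactness at the middle hom-set
print(image_of_second, set(hom_right))      # second induced map compared with its target
```

The image of the second induced map is {0}, while the target hom-set is {0, 1}: the identity map of Z/2 does not lift to Z/4, so the sequence of hom-sets cannot be extended by "->0" on the right, and Z/2 is not projective as a Z-module.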
----------------------------------------------------------------------
Regarding the existence of bases for all vector spaces (p.139, Theorem III.5.1) you ask
> ... Has anyone ever described an uncountable basis for a vector
> space? ...
Sure! Consider the space of real-valued functions on the real line with finite support (i.e., equal to zero except at finitely many points). It has a basis consisting of the elements that have value 1 at a single point, and 0 everywhere else. For a slightly less obvious example, consider the space of step functions on [0,1]; i.e., functions f such that you can divide [0,1] into finitely many intervals (let's say right-half-open, for concreteness, i.e., of the form [a,b) with 0\leq a < b <1, or of the form [a,1]) so that f is constant on each of these intervals. A basis for this space is given by the functions that are 0 on an interval [0,a) and 1 on the complementary interval [a,1] (a\in [0,1)). Still more challenging -- but more natural-seeming -- is the space of piecewise linear continuous functions on R or on [0,1]: functions that are of the form f(x) = ax + b on each of finitely many intervals into which one divides the domain, with values agreeing where two intervals meet (e.g., the function |x|). You might see whether you can find a basis for that. But the really natural cases -- trying to find a basis for R as a Q-vector-space, or for K^N as a K-vector-space where K is a field and N the set of natural numbers -- are probably impossible to do without using the Axiom of Choice, in the sense that it can probably be proved that the non-existence of such bases is consistent with Zermelo-Fraenkel set theory without that Axiom. A logician should know for sure whether this is so.
> ... Or even an uncountable linearly independent subset of a vector
> space?
This one can do even for the examples just mentioned.
To construct such an example in K^N, let F be the set of all finite subsets of N, and let f: N --> F be a bijection, say the map sending each nonnegative integer n to the set of i such that the binary expression for n has a 1 in the "2^i's column". Let P(N) be the set of _all_ subsets of N. We define a map x: P(N) -> K^N as follows: Given S\in P(N), for each i\in N we let x(S) have value 1 at i if f(i)\subset S, 0 otherwise. To show that the elements x(S) (S\in P(N)) are linearly independent, it will suffice to show that for any finite nonempty family of them, x(S_1), ..., x(S_n) with S_1, ..., S_n distinct, there exist m\in{1,...,n} and i\in N such that x(S_m) has value 1 at i, but all the other x(S_r) have value 0 there. But this is easy to show: Some S_m will not be contained in any of the others, and from this we can easily find a finite set T which is contained in S_m but not in any of the others. If we take i such that f(i) = T, we have the desired property. And here's a different sort of construction which I am fairly confident gives an uncountable set of real numbers linearly independent over Q, though I haven't thought out a detailed proof. For every real number p \geq 2, let f(p) be the real number which has decimal digit 1 in positions [p], [p^2], [p^3], ... , and 0 in all other positions, where for any real number x, the symbol [x] denotes the greatest integer \leq x. The set of these f(p) should have the desired properties.
----------------------------------------------------------------------
Regarding the discussion of "the functorial approach", pp.75-76 in the Companion (among the comments to be read at the end of III.5, p.142) you ask about the reference to "Lemma III.5.7(a)".
I meant Lemma III.5.c3(a) -- thanks for pointing this out!
----------------------------------------------------------------------
> On p.143 of Lang, after second display:
> "with b_i\in K" should be "with b_i\in A"
Thanks!
---------------------------------------------------------------------- You ask what Lang means by the equality = at the top of p.145. He is considering two bilinear maps: the given map V x V' --> K (p.144, middle), and the canonical map V x V^v --> K defined by evaluating members of V^v on members of V. He is denoting both of them by "< - , - >". So in the equation you ask about, the former map is meant on the left-hand side, and the latter on the right. ---------------------------------------------------------------------- You ask whether, for either of the sorts of duals considered in Lang (p.145), the dual is isomorphic to the triple dual. In general, no. To see this, consider (1) abelian groups of exponent p, for p some prime, and (2) vector spaces over the field of p elements. These are essentially the same, and the two duals are the same in this case, so we only need to examine the situation once. Further, an infinite-dimensional vector space over a finite field k has the same cardinality as dimension. Now if E is an infinite dimensional vector space with basis B, we see that to every subset S of B we can associate an element of the dual Hom(E,k) which takes members of S to 1\in k and other members of B to 0. The set of such elements will have cardinality greater than that of B; so in the infinite-dimensional case, the dimension of the dual is strictly greater than that of the original space, and of course this continues to hold for higher duals. So the dimension of the triple dual is higher than that of the dual. This shows that one does not have isomorphism in general; but over rings that are not fields, there are some special classes of cases (other than finitely generated free modules) where one does have it. The second part of extra-credit exercise I.7.1 is related to this question. There is another curious fact. Let me write "*" for either kind of duality. 
Then we know that there is a natural map E --> E**; taking E* in place of E we get a natural map E* --> E***; on the other hand, applying the contravariant functor ( )* to the map E --> E** gives a map E*** --> E*. If I recall, the composite E* --> E*** --> E* is the identity, so E* can be naturally identified with a direct summand of E***, though this is not true in general of E and E**. ---------------------------------------------------------------------- Regarding the concept of algebraically closed field (p.178), you ask > Does it make sense to talk about algebraically closed Division Rings, > and if so are the quaternions an example of an algebraically closed > division ring? Yes to the first question, no to the second. The existence of division rings that are algebraically closed in a strong sense is proved in L. G. Makar-Limanov, Algebraically closed skew fields. J. Algebra 93 (1985), no. 1, 117--135. But the quaternions do not have that property. For example an equation over the quaternions that has no root is Xi - iX - 1 = 0. Note that the standard concept of algebraically closed field F says that all equations over the field that could be satisfied in an extension to a field (i.e., division ring satisfying the commutative identity) have solutions in F. The field of quaternions satisfies some identities weaker than commutativity; e.g., everything commutes with the square of every commutator xy-yx. One could set up a concept of a division ring that is algebraically closed relative to the class of division rings satisfying a given family of identities; and conceivably, the quaternions might be algebraically closed relative to such a class. ---------------------------------------------------------------------- You ask about the fact that Lang only refers to the cases of characteristic 0 and characteristic a prime (e.g., on p.179, Proposition IV.1.12). A ring of nonzero characteristic n contains a copy of Z/nZ, and this has zero-divisors unless n is prime. 
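A brute-force sketch of the last assertion (the function name is hypothetical, and the range of n tested is an arbitrary choice): for each n it lists the pairs of nonzero residues whose product is zero in Z/nZ.

```python
def zero_divisor_pairs(n):
    """Pairs (a, b) with 1 <= a <= b < n and a*b = 0 in Z/nZ."""
    return [(a, b) for a in range(1, n) for b in range(a, n) if (a * b) % n == 0]

print(zero_divisor_pairs(6))   # pairs of zero-divisors in Z/6Z
print(zero_divisor_pairs(7))   # none, since 7 is prime

# For which n in the tested range does Z/nZ have no zero-divisors?
print([n for n in range(2, 30) if not zero_divisor_pairs(n)])
```

The last line prints exactly the primes below 30, matching the claim that Z/nZ has zero-divisors unless n is prime.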
So as long as we are interested in integral domains, the only nonzero characteristics that will come up are the primes. ---------------------------------------------------------------------- You ask, regarding Theorem IV.2.3 on p.182, whether there is a ring A which is not a UFD, but such that the prime elements of A[X] are nevertheless the primes in A and the primitive polynomials irreducible in K[X] (where K is the field of fractions of A). I think so -- I think that it should not be hard to show that if A is the union of a chain of UFD's, A_1 \subset A_2 \subset ... , then it will have the latter property. But such a union need not be a UFD. For example, if one adjoins to a field k an indeterminate Y, and then a square root of Y, and a square root of that square root, etc. -- getting a ring that could be written k[Y, Y^{1/2}, Y^{1/4}, Y^{1/8}, ... ], then Y cannot be written as a product of irreducibles. Don't have time to think through the details, though. The grading of the second Math 113 midterm has put me way behind on preparation. ---------------------------------------------------------------------- Regarding Lang's comment on p.183, "It is usually not too easy to decide when a given polynomial (say in one variable) is irreducible", you note that the integral root test provides an algorithm for finding linear factors, and you ask whether there are similar algorithms for higher-degree factors. I don't know much about the subject. To say there is an algorithm is not to say it is easy -- there is an algorithm for factoring an integer N, namely "Test every integer \leq \sqrt N to see whether it is a factor", but we know that for large N that is a lot of work. In teaching Math 114 a few years ago, I did think about the question of finding not necessarily linear factors for a polynomial with integer coefficients, and came to the conclusion that at least there are only finitely many that need to be tested. 
This is sketched as exercises 3.18 and 3.19 (pp.2-3) in my packet from that course, http://math.berkeley.edu/~gbergman/ug.hndts/m114_IStwrt_GT3_exs.ps . ---------------------------------------------------------------------- You ask how Lang gets a contradiction at the end of the proof of the italic statement on p.192. Your second guess is right -- he is implicitly performing induction on n. Thanks for pointing this out; I'll add the explanation to the Companion. ---------------------------------------------------------------------- You ask (cf. Prop. V.1.4, p.225, Prop. V.1.6, p.227, and Companion, bottom of p.84 and top of p.85): > Is there a way to characterize when a field extension > $k(a_1, \cdots, a_n)$ is the same as its subring > $k[a_1, \cdots, a_n]$? This will happen if and only if a_1, ..., a_n are all algebraic over k; but the simplest proof I can see requires methods that it would be messy to develop at this point. Roughly, the argument is as follows: Suppose K = k[a_1, ...,a_n] is a field, but that a_1, ..., a_n are not all algebraic over k. Let us rearrange these generators, if necessary, so that a_1 is transcendental over k. Then, if not all of a_2,...,a_n are algebraic over k[a_1], rearrange them so that a_2 is transcendental, etc.. Renaming the transcendental elements x_1,...,x_p, we can thus write our field in the form k[x_1,...,x_p, b_1,...,b_q] where p > 0, x_1,...,x_p are algebraically independent over k, and b_1,...,b_q are algebraic over k[x_1,...,x_p]. Note that within the field K = k[x_1,...,x_p, b_1,...,b_q], the subfield L = k(x_1,...,x_p) generated by k[x_1,...,x_p] will be isomorphic to the field of fractions of k[x_1,...,x_p], and that the latter ring is (up to isomorphism) the ring of polynomials in x_1,...,x_p. 
Now since b_1,...,b_q are algebraic over the subfield L, [K:L] is finite, hence in terms of some basis of K over L, the operations of multiplying by each element of K can be written as a matrix over L; i.e., K can be identified with a ring of matrices over L. Now when b_1,...,b_q are written as such matrices, only finitely many of the irreducibles in the polynomial ring k[x_1,...,x_p] can occur in the denominators of the finitely many entries of these finitely many matrices. From this, one can derive a contradiction to the fact that every element of k[x_1,...,x_p] becomes invertible in K. ---------------------------------------------------------------------- You ask about the difference between diagrams (2) and (3) on p.228. I think that in (2), by making E close to k and EF close to F, Lang is trying to focus your attention on the extensions E/k and EF/F, and thus to suggest statement (2) on the preceding page, "if E/k belongs to \cal C, then so does EF/F". On the other hand, diagram (3) represents the situation of condition (3), which is symmetric in the extensions E/k and F/k. ---------------------------------------------------------------------- You ask why, on p.230, 3rd line of paragraph preceding Lemma V.2.2, Lang can say that EF is the field of fractions of E[F]. Good point. First, when he says it "is" the field of fractions, he means that it is isomorphic, as an extension ring of E[F], to that field of fractions. To show this, let us denote the field-of-fractions construction by "ff". Then the universal property of ff(E[F]) says that given any commutative ring R with a homomorphism of E[F] into it, under which all nonzero elements of E[F] become invertible, there is a unique extension of this homomorphism to a homomorphism ff(E[F]) --> R. In particular, since EF is by definition a field containing E[F], we get such a homomorphism h: ff(E[F]) --> EF. 
Since EF is generated as a field by its subring E[F], the image of h can't be a proper subfield, so h is onto; and since ff(E[F]) is a field, and homomorphisms from fields to nontrivial rings are one-to-one, h is one-to-one. So it is an isomorphism. The general fact, of which this is a particular case, is that for any integral domain A, ff(A) is, up to isomorphism, the unique field containing (an isomorphic copy of) A, and generated by that subring. I should probably put that into the Companion.
----------------------------------------------------------------------
> Supposing that we are interested in finding extensions of an arbitrary
> ring A that contains roots of certain polynomials. Are there
> conditions when we can do a similar construction as k[x]/(p(x))
> as in the bottom of page 230 in lang?
We can always adjoin generators and impose relations; i.e., form a polynomial ring A[X_1,...,X_n] (if we want to adjoin an n-tuple of elements satisfying some equations) and divide this by an ideal I. If we want the resulting ring to have no zero-divisors, then we need to make sure I is a prime ideal, which can be tricky (even when n is 1) unless A has good properties, e.g., is a UFD. If in fact A has no zero-divisors and we want our extension ring to have the same property, then the simplest way to do this will often be to form the field of fractions k of A, construct an extension k[x]/(p(x)) as in Lang, which, as long as p(x) is irreducible in k[x], is guaranteed to have no zero-divisors, and then take the subring of this generated by A \subseteq k and the image of x.
----------------------------------------------------------------------
You ask about Lang's proof of Lemma V.2.2, p.231.
The key fact is the preceding comment to the effect that EF is the field of quotients of the set of elements a_1 b_1 + ... + a_n b_n.
That means that an element is in EF if and only if it can be written as a fraction with one such element in the numerator and another such in the denominator, and (implicitly), the same statement for sigma(E) sigma(F). Lang now observes that sigma takes expressions of that form in elements of E and F to the corresponding expressions in elements of sigma(E) and sigma(F). So on the one hand, it carries EF into sigma(E) sigma(F); moreover, it goes _onto_ that subfield, since every element of that subfield has the form shown on the right-hand side of the display in the proof. ---------------------------------------------------------------------- Regarding p.237, proof of Theorem V.3.3 you write > Here Lang seems to prove NOR1 => NOR3 => NOR2 in the first paragraph > of the proof, and NOR2 => NOR1 in the second paragraph. So, is it true > that his third paragraph on NOR3 => NOR1 is excessive, ... Right! Good point. ---------------------------------------------------------------------- Regarding the proof of Theorem V.3.4, p.238, you note that > Lang says several times let \sigma be an embedding, where it is > understood that \sigma is an embedding in an algebraic closure. Only > once does Lang say this though. Is this particular to this proof, or > when dealing with normal extensions, or will this be a common > occurrence? Thanks for pointing this out. It's not any standard usage, that I know of, but we should see how extensively Lang does this. ---------------------------------------------------------------------- You ask about the inequality "# of automorphisms \leq # of embeddings" at the top of p.95 of the Companion (which I note in relation to material on Lang's p.240). Since E lies in k^a, every automorphism of E is, in particular, an embedding in k^a. (Or if one wants to be formal, every automorphism, when composed with the inclusion map E --> k^a, gives an embedding.) 
---------------------------------------------------------------------- You ask how comparison of degrees in the 3rd display on p.248 gives [k(\alpha): k(\alpha^p)] = p. There \alpha has minimal polynomial f, and \alpha^p has minimal polynomial g. Now [k(\alpha): k(\alpha^p)] = [k(\alpha): k] / [k(\alpha^p): k] = deg f / deg g. And this ratio is p. I've listed these three steps briefly, not knowing which of them may have been your source of trouble. Let me know if the reason(s) for any of them is/are not clear. ---------------------------------------------------------------------- You ask why, on p.251 in the next-to-last sentence of the proof of Corollary V.6.10, E is separable over E^p k. Because the pth power of every element \alpha of E lies in that field. (Hence \alpha satisfies a polynomial, X^p - \alpha^p, with coefficients in that field which has only one root in an algebraic closure; hence the minimal polynomial of \alpha over E^p k, which must divide every polynomial over that field which \alpha satisfies, has only one root in the algebraic closure.) ---------------------------------------------------------------------- You ask why "perfect fields" (Lang, p.252) are so called. It's based on thinking of inseparability as a flaw -- something that makes it impossible to apply Galois Theory. Thus, not having any inseparable extensions is a "good" feature. Among the various words having the connotation "good", the one that was chosen for this feature, probably more or less at random, was "perfect". It may be that it wasn't entirely random. The original meaning of "perfect" was "complete". (Cf. the grammatical term "perfect tense", referring to verb forms like "has eaten", that say that an action has been completed.) So it could mean that the property "All elements of k^a that can't be distinguished from elements of k by the behavior of automorphisms _are_already_in_ k" was thought of as a "completeness" property of such fields k. 
---------------------------------------------------------------------- Regarding the definition of a Galois extension (p.262) you ask how it was discovered that separability and normality were the conditions under which Galois Theory could be developed. I don't know the detailed history of the subject, but I know that its original form was very different from the way it is presented today. Galois and the other early workers didn't think about groups of automorphisms of field extensions, but groups of permutations of the roots (in the field of complex numbers) of a polynomial, which preserved all algebraic relations satisfied by those roots. (So, for instance, if one writes the four roots of X^4 - 2 as alpha, i alpha, -alpha, and -i alpha, then the transposition that interchanges the first two roots does not preserve the relation saying that the first and third roots sum to zero and so does not belong to the Galois group as they understood it, while the transposition that interchanges the first and third does preserve all relations; it is what we would view as the restriction to the set of roots of a certain automorphism of the extension.) Since they were dealing with all roots of a polynomial, their groups of permutations of the roots corresponded to what we would see as groups of automorphisms of the splitting field of the polynomial, which is automatically a normal extension. On the other hand, since they were working within the complex numbers, which is of characteristic 0, separability was automatic. I assume that the need for separability was discovered when people tried to generalize the original results to arbitrary fields. One may ask: If the early workers viewed Galois Theory in terms of roots of equations rather than field extensions, how did they express the correspondence between subgroups and subfields?
Again, I don't know the details, but I would guess that they looked at algebraic expressions in the set of roots, and for each subgroup of the permutations, considered those expressions invariant under that group. The set of complex numbers that could be represented by such expressions would (in modern language) describe a subfield of the splitting field. What would be the motivation for passing from "groups of permutations of roots" to "groups of automorphisms of field extensions"? Probably the realization that any two polynomials whose roots generated the same field had "the same" Galois group; hence that it was really a function of the extension, and not just of the particular polynomial. ---------------------------------------------------------------------- Concerning Lemma VI.2.c1 on p.102 of the Companion (discussion relating to p.270 of Lang), you ask whether we know that the expression for a symmetric polynomial in terms of the elementary symmetric polynomials is unique. That's the italicized statement on p.192, saying that the elementary symmetric polynomials are algebraically independent. It means that if we map the ring of polynomials in n indeterminates, k[X_1,...,X_n] to k[t_1,...,t_n] by the homomorphism sending each X_i to s_i, the map will be one-to-one. Hence each member of the image of that homomorphism is the image of a unique element of k[X_1,...,X_n], i.e., has a unique expression in terms of s_1,...,s_n. ---------------------------------------------------------------------- You ask about the assumptions of characteristic not 2 or 3 in Example 2, p.270. You are right that under the assumption that f is irreducible of degree 3, only "characteristic not 3" is needed to insure that f is separable. However, the assumption that the characteristic is not 2 is needed to conclude in the next paragraph that any odd permutation of the roots will move \delta.
(It is always true that an odd permutation sends \delta to -\delta; but in characteristic 2 \delta and -\delta are the same.) So though Lang mentions the "not 2" assumption in an inappropriate place, that assumption is indeed needed for the analysis to work. Note that since S_3 has a normal series with factors Z_3 and Z_2, analysis of the Galois theory of a separable cubic involves considering successive Galois extensions with those two Galois groups. Extensions with Galois group Z_3 behave differently in characteristic 3 from other characteristics, while those with Galois group Z_2 behave differently in characteristic 2 from other characteristics, so both characteristics must be excluded to get the behavior that occurs in "most" cases. We'll learn a lot about extensions with Galois group Z_p, both in the "characteristic not p" and the "characteristic p" cases, in section VI.6. ---------------------------------------------------------------------- You ask how the restrictions on the characteristic guarantee separability in Example 2, p.270. I'm surprised that I can't find a really explicit statement of the fact in question in Lang. Anyway, it can be seen from Prop.V.6.1, p.247, that an irreducible polynomial over a field k can be inseparable only if k has characteristic p, and then only if the degree of the polynomial is divisible by p. (This is pointed out explicitly in the Companion, in the second paragraph of my comment on that page.) ---------------------------------------------------------------------- You ask how to get the subfields of the field discussed on p.271 corresponding to each subgroup shown on that page, noting that some of these subgroups do not leave any roots of X^4 - 2 fixed. I hope to show how to do this some time soon in class. But you should be able to work it out yourself. Remember that the typical element of Q(alpha, i) is of the form a + b alpha + c alpha^2 + d alpha^3 + e i + f alpha i + g alpha^2 i + h alpha^3 i.
Given any one of those subgroups, it is easy to write down conditions for the above expression to be fixed by the elements of that subgroup. It sounds as though you were looking for members of the basis {1, alpha, alpha^2, alpha^3, i, alpha i, alpha^2 i, alpha^3 i} that were fixed; but the subspace of elements in a vector space V fixed by a group of linear maps need not be spanned by those members of a given basis of V that happen to be fixed by the group. Perhaps what was misleading was that in this case, the basis is sufficiently nicely chosen so that the subspaces corresponding to some of the subgroups _are_ spanned by subsets of the basis. But not all. ---------------------------------------------------------------------- You ask why, in Example 7 on p.274, the polynomial X^5 - X - 1 is irreducible over F_5. This follows from Theorem VI.7.4(ii), p.290. (As I say in the Companion, this example uses later material). Specifically, we know that X^5 - X is identically zero on F_5, so it never assumes the value 1, so X^5 - X - 1 has no root in that field, hence by that theorem, it is irreducible. ---------------------------------------------------------------------- You ask how Gauss's Lemma is applicable in the 3rd line of the proof of Theorem VI.3.1, p.278. Note that since ord_p(1) = 0 for all p, a polynomial f having a coefficient equal to 1 must have ord_p f \leq 0 for all p. Hence if the product of two such polynomials has coefficients in our given UFD (here, Z), i.e., has ord_p \geq 0 for all p, we see from Gauss's Lemma that these orders are all 0; in particular, the polynomials have coefficients in our UFD. Thus, a factorization of a monic polynomial over Z into monic polynomials over Q must in fact be a factorization into polynomials over Z. ---------------------------------------------------------------------- Concerning the definition of "character" (p.282), you ask why they are so named.
There is seldom a solid answer to why some everyday word was chosen as the name of a mathematical concept; but I can say a bit about the more general meaning of "group character" of which the usage Lang gives is a special case. If a finite group G acts by vector-space automorphisms on a finite-dimensional vector-space V, this is called a "(linear) representation of G". This can be thought of as a module over the group algebra kG, and I'll use that viewpoint because we have learned basic module-theoretic language (though group theorists most often speak directly about representations without going via the language of modules). Now by choosing a basis for V, we can represent the action of each element of G by an invertible matrix over k; but this map to the matrix ring isn't an invariant of the representation because it depends on the choice of basis. However, the function that associates to each member of G the _trace_ of the corresponding invertible matrix is independent of choice of basis, and turns out to carry very powerful information about the representation. It is this that group-theorists call the "character" of the representation. (See Lang, section XVIII.2.) I suppose the idea behind the choice of that word was "fundamental feature". The most important representations that group theorists study, the "irreducible" representations, are the simple kG-modules (modules with no proper nonzero kG-submodules). If G is abelian, it turns out that all irreducible representations over an algebraically closed field are 1-dimensional. In this case, such a representation is just a homomorphism of G into the multiplicative group of 1-by-1 matrices, which is the multiplicative group of k, and the trace of such a matrix is just its one element; so characters of such a group are just homomorphisms into the multiplicative group of k. That is the restricted meaning of the term that Lang introduces on p.282. 
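To make the basis-independence of the trace concrete, here is a small sketch in Python. The group (Z_3 acting on a 3-dimensional space by cyclically permuting basis vectors), the change-of-basis matrix P, and all function names are my own choices for illustration, not anything taken from Lang. The matrices representing the group elements change when we pass to a new basis, but the list of traces does not.

```python
# Illustrative sketch: the character (trace) of a representation does not
# depend on the basis, although the representing matrices do.
# The representation, the matrix P, and all names here are hypothetical
# choices made for this example.

def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# The generator of Z_3 acts by a cyclic permutation of the basis vectors.
G = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

# A change of basis, with its inverse written out by hand.
P = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
P_INV = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]

def representation(generator):
    """Matrices of 0, 1, 2 in Z_3: the identity, the generator, its square."""
    return [IDENTITY, generator, matmul(generator, generator)]

def character(rep):
    """The function g |-> trace(rho(g)), listed as values on 0, 1, 2."""
    return [trace(m) for m in rep]

# The same representation written in the new basis.
G_NEW = matmul(matmul(P_INV, G), P)

old_character = character(representation(G))
new_character = character(representation(G_NEW))
print(G_NEW)                          # a different matrix from G
print(old_character, new_character)   # the same list of traces
```

Both characters come out as the values 3, 0, 0 on the three elements of Z_3: the identity fixes all three basis vectors, and the two nontrivial elements fix none.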
---------------------------------------------------------------------- You ask about the last line of p.284, saying that for an inseparable extension, the trace function is 0. To see this, remember that if [E:k]_i > 1, then [E:k]_i is a power of the characteristic. Hence as an element of the field, it is 0. ---------------------------------------------------------------------- > ... p.287 of Lang middle of the page 3rd display from the bottom > "The polynomials (f(x)/(x-\alpha_i))*(\alpha^r)/(f'(\alpha)) > are all conjugate to each other." > > What does it mean for polynomials to be conjugate to each other? On p.243, second paragraph, Lang defined elements to be conjugate if they were images of the same element under various embeddings. In the case of a Galois extension, this is equivalent to saying that they are in the same orbit under the action of the Galois group. Here he is implicitly extending this action of the Galois group to polynomials, by letting members of the group act on the coefficients; thus he means that these polynomials lie in the same orbit under that action. ---------------------------------------------------------------------- Regarding the last display on p.287, you ask why the \alpha there has no subscript. (Actually, this applies to the last two displays.) The operator "Tr" sends its argument to the sum of its images under all the embeddings of k(\alpha) in k^a. These embeddings take \alpha to \alpha_1, ..., \alpha_n respectively. It is true that (assuming k^a is taken to contain k(\alpha)) the element \alpha will be one of those \alpha_i, and we could arbitrarily choose it to be \alpha_1, but that isn't relevant here. Since we started with \alpha in the statement of the theorem, we may as well write it here, knowing that when Tr is applied to the expression, we will get a sum of expressions involving the \alpha_i with which we have been computing above. 
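Here is a numerical sketch of what "apply Tr" means in practice: sum the expression over all conjugates \alpha_1, ..., \alpha_n of \alpha. I am assuming, from the expression quoted in the question above, that the display concerns the trace of (f(X)/(X-\alpha)) (\alpha^r / f'(\alpha)), and that its value is X^r for 0 \leq r \leq n-1 (the Lagrange interpolation identity); I have not checked this against Lang's page. The polynomial f = X^3 - 2 over Q and the function names are my own choices for the example.

```python
# Sketch: Tr applied to an expression in alpha is the sum of that expression
# over the conjugates alpha_1, ..., alpha_n.  Assumption (not checked against
# Lang): the identity being illustrated is
#     Tr( f(X)/(X - alpha) * alpha^r / f'(alpha) ) = X^r   for 0 <= r <= n-1.
# The example polynomial f = X^3 - 2 is a hypothetical choice.
import cmath

def f(x):
    return x**3 - 2

def f_prime(x):
    return 3 * x**2

def conjugates():
    """The three complex roots of X^3 - 2: the images of alpha = 2^(1/3)."""
    alpha = 2 ** (1.0 / 3.0)
    omega = cmath.exp(2j * cmath.pi / 3)   # primitive cube root of unity
    return [alpha * omega**j for j in range(3)]

def trace(expr):
    """Sum of expr(alpha_i) over all conjugates alpha_i."""
    return sum(expr(a) for a in conjugates())

def interpolation_trace(x, r):
    """Trace of f(x)/(x - alpha) * alpha^r / f'(alpha) at a number x."""
    return trace(lambda a: f(x) / (x - a) * a**r / f_prime(a))

for x in (0.5, 3.0, -1.25):
    for r in range(3):
        # The two printed values should agree up to rounding error.
        print(x, r, interpolation_trace(x, r), x**r)
```

Note that the code never singles out one of the roots as "the" \alpha; the sum runs over all of them symmetrically, which is the point made above.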
---------------------------------------------------------------------- You note that Lang hasn't spelled out the steps of the proof of Prop.VI.7.1, p.291. Right. That means that he expects that you can fill in the details. You first need to check the definition of "distinguished class", if you aren't sure of it, and note the assertions that need to be proved, and what they mean in this case. Then go through them and see whether his proof contains what is needed to get them. If you are uncertain about some point, then ask about it! ---------------------------------------------------------------------- Regarding ruler-and-compass constructions (Companion, note on p.293), you mention reading in Dummit and Foote that one can construct with a ruler on which a unit distance is marked things that one cannot construct with an ordinary ruler and compass, and ask how this is possible. Such constructions are implicitly based on the assumption that one can use such a tool in a certain way: Put one mark of the ruler on some line, move the ruler so that that mark moves along the line, while the edge of the ruler always passes through a specified point (e.g., by putting a peg at that point and sliding the ruler against the peg), and tracing out the curve that the other mark on the ruler moves through. Finally, it is assumed that one can find the point where this curve intersects another line or curve. This is a plausible interpretation of how one could use such a ruler, but it bothers me that it is taken for granted, rather than stated explicitly. Perhaps there is a touch of showmanship in descriptions of what one can do with this construction -- surprising the reader by bringing in an ingenious real-world image of how a ruler might be used. In any case, one cannot do this with ordinary straightedge and compass. 
One could find as many points as one likes on the curve referred to above; one could approximate its point of intersection with, say, a line, to any degree of accuracy by repeatedly constructing more points. But in a finite number of steps one couldn't find the exact point of intersection with that line. ---------------------------------------------------------------------- You ask why, in the proof of Theorem VI.8.2, p.295, the assumption that B_2/(k^{*m}) is finitely generated implies that it is finite. Because it has exponent m. (The m-th power of every element of B_2 is, in particular, the m-th power of an element of k*, hence it is in the "denominator" of B_2/(k^{*m}); so every element of the factor group has m-th power 1.) In an _abelian_ group generated by finitely many elements x_1, ..., x_r of nonzero exponents n_1,...,n_r, every element can be written x_1 ^{a_1} ... x_r ^{a_r} with 0\leq a_i < n_i, so such a group is finite. ---------------------------------------------------------------------- You ask why, in the 6th line of the proof of Theorem VI.8.2, p.295, one has k((B_2)^{1/m}) = k((B_3)^{1/m}). Since B_2 is contained in B_3, we have k((B_2)^{1/m}) contained in k((B_3)^{1/m}). On the other hand, it was assumed two lines earlier that k(b^{1/m}) is contained in k((B_2)^{1/m}), and the reduction to the case B_2 finitely generated was done so as to preserve that property, so it is still true on this line. Since B_3 is generated by B_2 and b, we see that k((B_3)^{1/m}) will be contained in k((B_2)^{1/m}). ---------------------------------------------------------------------- You ask why, in the first line of the second paragraph of the proof of Theorem VI.8.2, p.295, the injectivity (one-one-ness) of the map from groups to field extensions follows from the first paragraph of the proof. Lang proves there that any inclusion of fields k(B^{1/m}) implies an inclusion between the corresponding groups B.
Since equality of two sets is equivalent to inclusions both ways, equality of two fields of the form k(B^{1/m}) implies equality of the groups. That says that the map from the groups to the fields is one-to-one. ---------------------------------------------------------------------- You ask whether a polynomial X^n - a (Lang, p.297) can split into linear factors some of which are repeated but others not. No. Let p be the characteristic of k. We know that if p|n, then over the algebraic closure of k, X^n - a factors as (X^{n/p} - a^{1/p})^p, and factoring the factor X^{n/p} - a^{1/p} into linear factors, we see that each of these linear factors is a repeated factor of X^n - a. On the other hand, if p does not divide n, then X^n - a and its derivative are relatively prime (it is easy to write a, and hence 1, as a linear combination of them), so by Proposition IV.1.1, there are no multiple roots. ---------------------------------------------------------------------- Regarding corollary VI.9.3, p.299, you ask > Is it usually possible, given an algebraically closed field K of > characteristic 0, to find a subfield k such that K/k is of degree 2? > It seems that one could look at a finite group of automorphisms of K > and try to find a subgroup of order 2, but the problem is whether this > finite group exists, and if it does, whether it has a subgroup of > order 2. ... It follows from Corollary VI.9.3 that if the Galois group of K/k has any finite nontrivial subgroups, then all such subgroups are of order 2 ! Typically, there will be many such automorphisms. For instance, note that the rational function field Q(X) can be embedded in the complex numbers by sending X to a real transcendental number, to a pure imaginary transcendental complex number, or to one that is neither (and that in the last case, there are three possibilities as to whether the real and/or imaginary part of the image of X is transcendental). 
If we let K be the algebraic closure of Q(X), then our embedding of Q(X) in C must extend to an embedding of K in C. In the cases where the image of X is real or pure imaginary, the image of Q(X) will be closed under complex conjugation, and one can show that the same will be true of the image of K. (One can also do this in some cases when the image of X is neither real nor pure imaginary -- probably whenever its real and imaginary parts are algebraically dependent, though I haven't thought this through.) Hence complex conjugation induces an automorphism of K via each of these embeddings; but these automorphisms are different, because one fixes X while the other sends it to -X. An automorphism with the former property is conjugate to an automorphism with the latter property via an automorphism of K that sends X to iX. I don't know whether all order-2 automorphisms of K are conjugate. ---------------------------------------------------------------------- You ask about the statement on the top line of p.301, that the subgroup shown at the bottom of the preceding page is the commutator subgroup. I can't make sense of the explanation Lang gives ("because the factor group ..."). Here's what I've been able to work out. Though Lang says "for arbitrary n" two lines above the display of that subgroup, we should take that only to mean that he is no longer restricting to prime n, but is still assuming n odd. Note that for n odd, 2\in (Z/nZ)*. Now let A be the matrix one gets from the expression "M = ..." in the third display on p.300 by taking b = 0 and d = 2, and let B be the matrix one gets by taking d = 1 and any b. Then you will find that A B A^{-1} B^{-1} = B, showing that B is in the commutator subgroup, as Lang claims. This does not remain true if n is even.
In that case, every d\in (Z/nZ)* is odd, and if you compute commutators, you will find coefficients "d-1" on the elements in the lower left position of your matrices, from which you can deduce that every member of the commutator subgroup is of the form at the bottom of p.300 with an _even_ entry b; so we don't get all the elements he claims. ---------------------------------------------------------------------- You asked about results such as the proof that R(sin\theta, cos\theta) is pure transcendental, noted in the Companion, p.126, in the note to p.357 of Lang. I don't know much about such results, but there is a heuristic that leads to the result in that case. Note that the field is generated by the coordinate functions on the circle. Take it for granted that what one wants is equivalent to a way to parametrize points of the circle by a single real parameter, using rational functions. If one tries something "obvious" like intersecting the circle with vertical lines differing in their x-coordinates, one runs into the trouble that every line hits the circle twice; so the point of intersection is the solution to a quadratic equation, which is not a rational function of x. However, suppose instead that one fixes a point of the circle, say (-1,0), and intersects the circle with lines passing through that point, parametrized by their slope. There will still be two solutions for every slope, but one of these is known, the point (-1,0), and when one knows one solution of a quadratic equation, one can find the other solution by ring operations (without using square roots). And this indeed gives a parametrization of the circle by rational functions of a single variable. (The point (-1,0) itself appears when that variable goes to infinity.) Moreover, by some elementary geometry, the angle that line makes with the x-axis is half the arc cut off between (1,0) and its other intersection with the circle. 
If we call the general point of the circle (cos\theta, sin\theta), this says that the slope of the line is tan(\theta/2). This explains the fact that one can express sin\theta and cos\theta as rational functions of tan(\theta/2) and vice versa. Algebraic geometers use this principle in other ways. If one has a cubic curve in the plane, a line will in general intersect it in three points. But if the cubic has a double point, then a line through that point can be considered to intersect it twice there, and will have just one other point of intersection with it. This leads to the fact that a cubic with a double point has a rational parametrization, while other cubics do not. ---------------------------------------------------------------------- You ask how one would deduce, as I indicate in the Companion, (p.126, note to follow end of VIII on p.357 of Lang) that real or complex functions e^{a_1x}, ... e^{a_nx} are algebraically independent if and only if a_1, ..., a_n are linearly independent over Q, from the assertion that e^{c_1x}, ... e^{c_mx} are linearly independent if and only if c_1,...,c_m are distinct. If R is a commutative ring containing a field K, the condition that a family of elements x_1,...,x_n of R is algebraically independent over K is equivalent to saying that the family of all monomials in these elements, (x_1^{d_1}...x_n^{d_n})_{d_1,...,d_n\geq 0}, is K-linearly independent. Now the monomials in the functions e^{a_1x}, ... e^{a_nx} are the functions e^{(d_1 a_1 + ... + d_n a_n)x} (d_1,...,d_n\geq 0), and from the assertion quoted above, these are linearly independent if and only if the expressions d_1 a_1 + ... + d_n a_n (d_1,...,d_n\geq 0) are all distinct, which holds if and only if a_1, ..., a_n are linearly independent over Q. (That last "if and only if" takes a little thought. 
Given a linear dependence relation over Q, one can clear denominators, and move the negative terms to the other side of the equation, to get an equality of linear combinations with nonnegative integer coefficients.) ---------------------------------------------------------------------- Regarding my comment in the Companion on the proof of Theorem A2.1.3 (p.877), where I say that by making infinitely many choices of elements, Lang implicitly uses the Axiom of choice, and that even making the needed choices as a countable sequence and saying "and so forth" would implicitly require that axiom, you ask why induction doesn't suffice. The best I can do is offer an analogy: There are uncountably many sequences of 0's and 1's, so it is not true that for every such sequence, there is a finite computer program that (running on a machine that never wore out) would give that sequence. Yet for every such sequence and every n, there is certainly a computer program that will give the first n terms of that sequence; e.g., you can include those terms in the program, and just have it spew them out. Here, similarly, having every finite case "doable" doesn't mean -- without the assumption of the Axiom of Choice -- that the whole task is doable. As for induction being applicable -- all that induction tells us is that for every n there exists an n-tuple (x_1,...,x_n) with x_i\in X_i; it doesn't say that there exists a natural-numbers-tuple (x_1,...) with that property. Not being an expert in the field, I can't give a precise explanation, but my understanding is that set theorists, given a model of set theory in which the Axiom of Choice holds, can construct within it a model in which that Axiom fails -- that is, they define certain sets and only those to be the "sets" of that theory, and verify that this system satisfies the other Zermelo-Fraenkel axioms, but not Choice.
---------------------------------------------------------------------- You ask, with respect to the proof of the Axiom of Choice from Zorn's Lemma, "what chain is {p}?" As stated on the first page of the material on the Axiom of Choice etc., "A subset S of a partially ordered set P is called a chain if it is totally ordered under the induced ordering." So any one-element subset of a partially ordered set is a chain, by definition, since it has no incomparable elements. ---------------------------------------------------------------------- You ask about the statement you had heard that "everything is equivalent to the axiom of choice". My first reaction was "Nonsense!", but thinking a little further, I have a guess as to what was meant: It is that every appropriately general statement that requires the axiom of choice for its proof is in fact equivalent to that axiom. As an example of what I mean by "appropriately general" -- the statement "there exists a well-ordering of the real line" requires the axiom of choice to prove it, but it is not equivalent to the axiom, but only to the case of that axiom where certain of the sets involved have cardinality _< that of the real line. But the statement "all sets can be well-ordered" is, as shown in the reading in the Companion, equivalent to the axiom of choice. I believe that, similarly, the major results of algebra that one proves using the axiom of choice -- that every vector space has a basis, that every ring has a maximal ideal, etc. -- all turn out to imply the axiom as well. Loosely, the statement you heard means "If one can reasonably ask whether something is equivalent to the axiom of choice, the answer will be yes." Of course, even so interpreted, the statement is doubtless an exaggeration; I am sure set-theorists can come up with statements that are strictly weaker than the axiom of choice without being restrictions of more natural statements that are equivalent to that axiom. 
But it is nonetheless an impressionistic summary of the repeated outcome when one sees a result proved from the axiom of choice, and raises the question of whether the implication is reversible. ---------------------------------------------------------------------- You're right that where Lang writes "card(M) <= card(A)" and "by Bernstein's Theorem" near the top of p.889, neither is needed! Shall include this in my next set of errata to him. ---------------------------------------------------------------------- You ask whether there is a way to explain why the one-word proof of Corollary A2.3.7 (p.889), "Induction", doesn't work in the infinite case. I'm not sure in how deep a sense you are asking this. That proof doesn't work in the infinite case simply because mathematical induction is a technique that gives you results for all natural numbers (finite ordinals) but not for infinite ordinals. If one thinks it should extend to the first infinite ordinal, one gets absurd results; e.g., though induction using the statement "if n is finite, then so is n+1" gives the correct fact that every natural number is finite (a tautology, but that's beside the point), if we claimed it extended to infinite ordinals, it would give the false conclusion "the first infinite ordinal, omega, is finite". There are techniques having somewhat the role of induction for infinite arguments, of which we have learned one, Zorn's Lemma. But one doesn't get the infinite case for free; one has to prove something more than is needed to prove things in the finite case (otherwise, as I have said, things true in the finite cases would all be true in the infinite case, which simply isn't so). In Zorn's Lemma, this is the condition "the partially ordered set is inductive".
In any proof of this sort that would give the result of Corollary A2.3.7 for denumerable products, this step would be a description of how the products over larger sets can be constructed from products over smaller sets, and a proof that this method of construction preserved cardinalities. There is a construction that gives large (e.g., denumerable) products in terms of smaller (e.g., finite) products; it is called "the inverse limit". (This is developed in Lang section I.10; we don't have time to cover it in 250A and I don't have any material on it in the Companion. It is covered in sections 7.1-7.2 of my Math 245 notes; these require the basic concepts of category theory, which we will get in sections I.11-I.12 of Lang.) But inverse limit does not preserve denumerability! Exercise A2:L13 shows that many cardinalities are indeed preserved by denumerable products, and Lang originally asked whether this was true of all sufficiently large cardinalities. But as his added note shows, this is not so; details are sketched in the optional material in the Companion. ---------------------------------------------------------------------- In relation to Corollary A2.3.7, p.889, you ask > How would one go about finding the cardinality of the direct product > of an infinite set A with itself denumerably many times? There isn't a real sense in which one can "find" this, because the properties of the arithmetic of infinite cardinalities depend on the axioms of set theory (in addition to the standard ones) that one assumes. But for some partial information, see Lang's Exercise A2:L14. ---------------------------------------------------------------------- For A an infinite set, you ask how large a set B must be so that the cardinality of the set of functions from A to B would be larger than that of the set of functions from A to {0,1} (cf. Theorem A2.3.10, p.890). B would have to be bigger than the set of subsets of A.
To see this, let us use the notation of the arithmetic of cardinals as in Exercise A2:L8. Calling the cardinalities of A and B alpha and beta, we see that if, on the contrary, beta _< 2^alpha, then beta^alpha _< (2^alpha)^alpha = 2^(alpha x alpha) = 2^alpha (the last step because alpha x alpha = alpha for infinite alpha). On the other hand, for beta > 2^alpha, we get beta^alpha >_ beta > 2^alpha. ---------------------------------------------------------------------- Regarding Corollary A2.3.11, p.891, you ask > Given an infinite set A and a set B of cardinal less or equal to A, > what is the relation between the cardinal of A and the cardinal of > the subsets of A of cardinal equal to B? It is easy to see that it lies between the cardinality of A and the cardinality of the set of subsets of A, and that it is at least the cardinality of the set of subsets of B. Beyond that, I don't know what one can say. ---------------------------------------------------------------------- You write that in the proof of the Schroeder-Bernstein Theorem (p.885), the division of A as A_1 union A_2 seemed "an artifice for the proof" rather than the intuitive argument you had been expecting. Well, there is a neat intuitive idea, and then there is an artifice needed to make it work out. The idea is to consider chains of elements x, f(x), g(f(x)), f(g(f(x))), ..., alternately in A and B (continued backwards as well as forwards if the element on the left is in the image of g or f respectively). This can be pictured by drawing a red arrow from every element a\in A to f(a)\in B, and a blue arrow from every b\in B to g(b)\in A, and looking at each element as lying in a unique "ladder" of such arrows. Then the idea is to get our correspondence by "switching adjacent pairs" in each ladder. But which way to do the switch -- should we switch the pairs connected by red arrows or those connected by blue arrows?
One method will leave the element at the top of a ladder that begins with a blue arrow unmatched; the other will do this to the element at the top of a ladder that begins with a red arrow. (Either will work for a ladder that has no top.) So we use one method for one type of ladder-with-a-top, the opposite way for the other type, and make an arbitrary choice for ladders with no top. So -- elegant proofs involve both intuitive ideas, and technical tricks needed to make them work! ---------------------------------------------------------------------- You ask > ... are most set theories that do not allow [the set of all sets] > free from inconsistency, or do they run into problems elsewhere? ... The only axiom-systems that set-theorists are interested in looking at are those that, so far as they know, are consistent; so "all" viable set-theories have this property. My understanding is that set theorists have proved various systems "consistent if arithmetic is consistent", but know that the latter cannot be proved, but only assumed. Finally, there are a lot of axioms that they look at (especially large-cardinal conditions) that they strongly suspect are consistent with the existing axioms, but for which they don't have any proofs of consistency. One member of the Berkeley Math Department, Jack Silver, is a maverick in this respect; he believes that these large-cardinal conditions probably lead to contradictions; but so far as I have heard, he has never been able to prove such a result. My sense is that since Russell's paradox, set theorists have been aware of where the dangers lie, and have been able to avoid them. But Silver thinks otherwise. ----------------------------------------------------------------------
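A small computational illustration of the "ladder" picture for the Schroeder-Bernstein Theorem (p.885) discussed above may help. This is only a sketch under assumptions of my own choosing, not anything in Lang: A and B are both taken to be the natural numbers, the injections f(a) = 2a and g(b) = 2b+1 are hypothetical examples, and the names top_side, h and h_inv are invented for the illustration. With these particular maps every ladder has a top, so the case of ladders with no top does not arise. On ladders whose top lies in A (they begin with a red arrow) we match along red arrows, i.e., use f; on ladders whose top lies in B we match along blue arrows, i.e., use the inverse of g.

```python
def f(a):
    """Hypothetical injection A -> B (the 'red arrows')."""
    return 2 * a

def g(b):
    """Hypothetical injection B -> A (the 'blue arrows')."""
    return 2 * b + 1

def f_inv(b):
    """Preimage of b under f, or None if b is not in the image of f."""
    return b // 2 if b % 2 == 0 else None

def g_inv(a):
    """Preimage of a under g, or None if a is not in the image of g."""
    return (a - 1) // 2 if a % 2 == 1 else None

def top_side(x, side):
    """Climb the ladder backwards from x, which lies in `side` ('A' or 'B').

    Returns the side containing the top of the ladder.  For the maps chosen
    here the climb always stops, since the values decrease until 0 in A.
    """
    while True:
        prev = g_inv(x) if side == "A" else f_inv(x)
        if prev is None:
            return side
        x, side = prev, ("B" if side == "A" else "A")

def h(a):
    """The resulting bijection A -> B."""
    if top_side(a, "A") == "A":
        return f(a)       # ladder begins with a red arrow: follow red
    return g_inv(a)       # ladder begins with a blue arrow: follow blue backwards

def h_inv(b):
    """The inverse bijection B -> A."""
    if top_side(b, "B") == "A":
        return f_inv(b)
    return g(b)

if __name__ == "__main__":
    for a in range(8):
        print(a, "->", h(a))
```

One can check on an initial segment that h_inv undoes h and vice versa, which is what the switching argument predicts; of course this verifies nothing about the general theorem, only that the recipe is coherent in this one example.
----------------------------------------------------------------------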