ANSWERS TO QUESTIONS ASKED BY STUDENTS in Math 113, Fall 2001,
taught from Beachy and Blair's "Abstract Algebra", 2nd ed.
----------------------------------------------------------------------
You ask why, as mentioned in the last sentence on p.x, giving examples
that are model exercises works in Calculus but not in Abstract
Algebra.

Most of what a Calculus course teaches students is how to perform
certain kinds of calculations.  What a course like this teaches, on
the other hand, is how to understand and reason about algebraic
structures, and express your reasoning in proofs.  Examples that are
model exercises can teach you how to perform calculations, but not how
to reason.

The above contrast is a simplification, of course.  There is some
reasoning involved in Calculus, and we would like students to develop
the ability to reason about the subject themselves; but most of the
students taking Calculus are "allergic" to proofs, and we end up with
a compromise: much less emphasis on reasoning and understanding and
proofs at that level than we would like, but more than most of the
students would like.  And there are also a few computational
techniques developed in Math 113; but they are not the core of the
course.

Most of the mathematics taught at the upper division and graduate
levels concerns understanding and proofs, not computational
techniques.  There is also some mathematics of that sort taught at
lower levels.  If you took High School geometry, that concerned proofs
rather than computations.  A few decades ago, there was something
called the "New Math", which introduced abstract ideas and proofs in
Elementary School, and aimed to develop understanding and not just
computational skills.  The trouble was, the teachers who were supposed
to teach it didn't understand it, so the result was total failure.

You mention that the examples in the text are very different from what
you have to do in the homework.  But there is something in the text
that is very much like what you have to do in the homework -- the
proofs of Lemmas, Theorems, etc.!  It is those that you should use as
your models.  They aren't models where you "just substitute in
different numbers", but they are models of how a proof works.  (What
the examples in the text are for is to familiarize you with the things
that are being talked about, so that you understand them more clearly,
and will be able to create correct proofs.)

I hope this helps.
----------------------------------------------------------------------
You ask (in reference to the heading "I'm no good at writing proofs"
on p.xi) how much detail needs to be included in proofs, and what can
be assumed without justification.

Those are important questions, but hard to answer!  As a starting
point, I would suggest taking the proofs in the text as your model,
but supplying a little more detail than they do.  Often they say
something is "clear"; but this usually means that it follows from a
result that has been proved, and implicitly challenges the reader to
figure out what the reason is.  But when doing homework, you are
supposed to be showing that you know the material, rather than
challenging the grader to figure out why something is true.

An important general comment I will make is that when a proof has hard
parts and easy parts, the student who gives a lot of detail on the
easy parts and leaves out, or makes vague remarks in place of, the
hard parts will get very little credit, while leaving out details
about the easy parts will cost less.

If in doubt whether a point can be called "clear", you might think
through what you _would_ have to say to justify it thoroughly.  If it
turns out to be more complicated than you had realized, you should
probably give details.  Finally, when in doubt, you can come to office
hours and ask my advice.
----------------------------------------------------------------------
You ask in connection with the example on p.1 whether every matrix
with determinant 1 has the property that when we take its powers,
these eventually repeat.

No.  The matrices

     / 1 1 \            / 0 1 \
     \ 0 1 /    and     \ 1 1 /

are examples which don't have this property.  (You might enjoy
computing the first few powers of each of them.  They show rather
different patterns.)
----------------------------------------------------------------------
You ask why, on p.6, six lines after axiom 1.1.2, s-|b| is the
smallest element in S.

Beachy and Blair have defined a new set, which they don't name, but
let's call it T.  For each n in S, n+|b| belongs to T, and for each m
in T, m-|b| belongs to S; so elements of S and T correspond to one
another, in a way such that larger elements correspond to larger
elements and smaller elements correspond to smaller elements.  So if s
(a poor choice of letter on their part!) is the least element of T,
then the corresponding element of S, namely s-|b|, will be the
smallest element of S.
----------------------------------------------------------------------
You ask why the elements a - bq in the set R on p.7 must "be divisors
of q".

In that expression, the symbol "|" doesn't mean "divides", it means
"such that"!  I didn't realize there was an ambiguity, because my eyes
are accustomed to looking for a "such that" sign when I see
set-brackets.  But someone else in the class also read it that way.
Thanks for pointing this out; I will mention the problem in my letter
to the authors at the end of the semester.
----------------------------------------------------------------------
You ask why, in the proof of the Division Algorithm (p.7), they can
set q = -|a| in the formula "R = { a - bq | q \in Z}".

That formula means that R is the set of numbers a - bq that one gets
by letting q take on _all_ values in Z.  So as we let q run through
all the values in Z, _one_ of the values is -|a|, so _one_ of the
elements of R is a - b(-|a|) = a + b|a|.  So if you can see that that
element is nonnegative, you are done.  Do you see that that element is
nonnegative?  (I mentioned why in class.)
----------------------------------------------------------------------
You ask how one shows, in the proof of Theorem 1.1.6, p.9, that any
GCD is a linear combination of a and b.

Note the statement 6 lines before the beginning of the Theorem, "the
question of uniqueness is easily answered ...".  There they show that
a pair of elements cannot have more than one g.c.d..  Now in the proof
of Theorem 1.1.6, they show that a and b do have a g.c.d. which is a
linear combination; hence, since a and b can't have more than one
g.c.d., they can't also have a g.c.d. which is not a linear
combination!
----------------------------------------------------------------------
You ask how, on p.10, second full paragraph, the conditions (b,r)|a
and (b,r)|b lead to the conclusion (b,r)|(a,b).

Those two conditions say that (b,r) is a _common_divisor_ of a and b,
hence it must divide the _greatest_ common divisor of those elements.
----------------------------------------------------------------------
You ask why 1 isn't considered a prime number (p.17, Definition
1.2.4).

Many things would go wrong if we considered 1 a prime number; which
answer(s) to give you depends on which you would find most convincing.
I'll give just one here:  If we considered 1 a prime, then we would
not have unique factorization.  For instance, as factorizations of 6
into primes we would have

     2 . 3,   1 . 2 . 3,   1^2 . 2 . 3,   1^3 . 2 . 3,   etc.

We want the primes to be our "building blocks" in expressing positive
integers as products, but "1" isn't much good as a "building block".
----------------------------------------------------------------------
You ask why, in the last sentence of the proof of Lemma 1.2.5, p.17,
the authors can say that either p|a or p|b would lead to a
contradiction.

As stated in the preceding sentence, a and b are positive integers
smaller than p.  But if p|a, that means a would be a multiple of p,
hence it can't be smaller than p; and similarly for b.

You end, "I thought in the lemma, we wanted our result to be that p|a
or p|b."  We want that conclusion to hold _if_and_only_if_p_is_prime.
We have already shown that if p is prime, that does hold.  Now we are
considering the case where p is not prime, and we are showing that in
this situation it does not hold.
----------------------------------------------------------------------
You ask how the Sieve of Eratosthenes (p.17) originated.

Well, if a positive integer n is not a prime, then it must be
divisible by a prime < n.  Hence to tell whether n is a prime, you
don't have to test for divisibility by all positive integers < n; it
is enough to test for divisibility by the primes < n.

Suppose you want to find all the primes up to some integer such as
100.  Then it is most efficient not to take them one by one and test
them, but to set up an "assembly line" where first you eliminate all
integers bigger than 2 (in the given range) that are divisible by 2;
then all integers bigger than 3 that are divisible by 3; then all
integers bigger than 5 that are divisible by 5, and so on.  These "2",
"3", "5" etc. are, of course, the primes; and the neat thing about
this "assembly line" procedure is that as it discovers a prime, it
calls it into service in excluding other integers that are multiples
of it.
----------------------------------------------------------------------
You ask how, in the first paragraph of the proof of Theorem 1.2.6,
p.18, the authors conclude that c and d have factorizations into
primes.

They have assumed that b is the _smallest_ positive integer not having
a factorization into primes.  Now c and d are positive integers
smaller than b; so being smaller than the smallest value without a
prime factorization, they must have prime factorizations.
----------------------------------------------------------------------
You ask how, in the second paragraph of the proof of Thm 1.2.6 (p.18),
Euclid's Lemma implies that q_1|p_k for some k.

Well, Euclid's Lemma says that if a prime divides a product of two
integers, it divides one of them.  From this it follows by induction
(though the authors don't say this explicitly) that if a prime divides
a product of any number of integers, it divides one of them.  From the
second factorization of a, we know that q_1 divides a; writing a as in
the first factorization, this says that q_1 divides
p_1 ^alpha_1 ... p_n ^alpha_n, a product of factors p_1, ..., p_n.  So
Euclid's Lemma tells us that q_1 divides one of those factors, i.e.,
one of the primes p_k.
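
In case you would like to see that induction written out, here is a
sketch in LaTeX notation (the wording is mine; the argument is the
standard one):

  \textbf{Claim.}  If $p$ is prime and $p \mid a_1 a_2 \cdots a_n$,
  then $p \mid a_i$ for some $i$.

  \textbf{Sketch.}  Induct on $n$.  For $n = 1$ there is nothing to
  prove.  For $n > 1$, write
  $a_1 a_2 \cdots a_n = (a_1 \cdots a_{n-1})\, a_n$.  By Euclid's
  Lemma (the two-factor case), $p \mid a_1 \cdots a_{n-1}$ or
  $p \mid a_n$.  In the second case we are done; in the first, the
  inductive hypothesis gives $p \mid a_i$ for some $i \leq n-1$.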
----------------------------------------------------------------------
You ask about the last steps in the proof of the Fundamental Theorem
of Arithmetic (bottom of p.18), after the proof that p_1|q_j for some
j and q_1|p_k for some k.

First, if one prime divides another, they must be equal (clear?), so
as they say, p_1 = q_j and q_1 = p_k.  In the next sentence, the
authors write "Then j = k = 1 since", so the calculation at the end of
that same line must be the justification for that claim.  The
inequality q_1 _< q_j holds because the q's are arranged so that q_1
is the smallest; the other "_<" holds for the analogous reason, and
the two "="s are what has just been shown.  But note that this gives a
sequence of "="s and "_<"s which begins and ends with the same number.
This can only happen if all the terms are equal!  Do you see that this
gives "j = k = 1" and the facts needed to get the next display?
----------------------------------------------------------------------
You ask about the equation a/p_1 = a/q_1 in the last display of p.18.

This holds because the inequalities q_1 _< ... p_1 _< ... q_1
immediately preceding imply that p_1 = q_1.  (They should make that
explicit.  I'll mention this in my letter to them at the end of this
semester.)
----------------------------------------------------------------------
You asked about the contradiction referred to in the last sentence on
p.18, with which the authors complete their proof of uniqueness of
factorizations.

Note that in the first sentence of that paragraph, they begin "If
there exists an integer ... for which the factorization is not
unique", and they then choose a to be the _least_ such integer.  But
assuming this, they find that there must be a smaller integer with
non-unique factorization, and that contradicts their choice of a as
the least such.  Thus the assumption that there are integers for which
the factorization is not unique leads to a contradiction, so
factorization is always unique.
----------------------------------------------------------------------
You ask whether there is another way to prove the Fundamental Theorem
of Arithmetic (Theorem 1.2.6, p.18) than by contradiction.

I think any proof of the theorem will involve the same ideas as this
proof, but the argument doesn't have to be stated in terms of a
contradiction; one can also word it as a proof by induction.  Instead
of saying "Assume a is the least integer for which unique
factorization fails" and getting a contradiction, one can say "Assume
by induction that unique factorization holds for all integers s < a"
and then prove that it will also hold for a.  The idea is to consider
any two factorizations of a (which we want to prove are the same),
perform the same calculations that the authors use, and when you come
to the last display on p.18, note that by the inductive assumption,
the two factorizations of s must be the same, and that makes the two
factorizations of a the same, as required to complete the inductive
proof.
----------------------------------------------------------------------
You ask how, at the bottom of p.19, we get p|(a - p_1 p_2 ... p_n).

We have p|a by assumption, and have just noted that p|p_1 p_2 ... p_n.
If an integer divides two numbers, then it also divides their
difference.
----------------------------------------------------------------------
You ask whether it is OK, in a proof, to justify a step by saying
"This is a familiar fact", as the authors do with the factorization
displayed in the middle of p.20.

That's a good question, but the answer depends very much on the
situation!  If a given course expressly assumes the results of another
course, then you should be able to use the results of that other
course.  In the opposite direction, a course like Math 104 _rebuilds_
rigorously the concepts that were presented to you in a hand-wavy
fashion in Math 1AB, so there it is not safe to say something is true
by a result from 1AB.

However, when a result is verifiable by an easy calculation (in this
case, multiplying out the right-hand side of the display and seeing
that the result is the same as the left-hand side), then that is what
really matters, and mentioning that it is familiar is just an aside; a
way of being "reader friendly".

So you'll just have to try to judge the context.  If in doubt, ask, or
if you can't ask (e.g., on an exam, or the night before homework is
due), play it safe and give the details if you have time.
----------------------------------------------------------------------
You ask how the proof of Proposition 1.2.9 on p.21 "follows
immediately from the fundamental theorem of arithmetic".

I would have said a bit more than the authors did.  The reasoning is
as follows.  Suppose we are given two integers

     u = p_1 ^m_1 ... p_r ^m_r
     v = p_1 ^n_1 ... p_r ^n_r

(where "_" means the next symbol is lowered, and "^" means the next
symbol is raised -- I'll provide a handout on the representation of
such things in e-mail soon).  Then u|v if and only if m_i is less than
or equal to n_i for all i.  "If" is clear, while "only if" is proved
using unique factorization.  This criterion for divisibility has been
used implicitly in the discussion of the set of factors of an integer
on the preceding pages.

Knowing this criterion, we easily deduce that for a and b as in
Proposition 1.2.9, an integer will divide both a and b if and only if
it can be expressed as a product of powers of p_1, ..., p_n, with the
exponent of each p_i less than or equal to both alpha_i and beta_i,
i.e., less than or equal to min{alpha_i, beta_i}.  The biggest integer
with these properties is the one where for each i the exponent is
exactly min{alpha_i, beta_i}, and this gives statement (a).  Statement
(b) is seen similarly.
----------------------------------------------------------------------
You ask about the statement on p.23, after Example 1.3.1, that if
x = c is a solution of the equation
a_k x^k + a_{k-1} x^(k-1) + ... + a_0 = 0, then the result of
substituting x = c in that polynomial must be divisible by every
integer n.

The result of substituting x = c in that polynomial is 0, and 0 is
divisible by every integer n!
----------------------------------------------------------------------
You ask why on p.24, second paragraph of proof of Prop. 1.3.2, the
condition n|(a-b) implies n|(r_1 - r_2).

By assumption, r_1 differs from a by a multiple of n, and r_2 differs
from b by a multiple of n, hence r_1 - r_2 differs from a-b by a
multiple of n.  Hence if a-b is itself a multiple of n, r_1 - r_2 will
be too.
----------------------------------------------------------------------
I hope what I said in class answered your question about the last
sentence of the proof of Proposition 1.3.4, p.26:  The equation
ba + tn = 1 implies the congruence ba + tn == 1 (mod n), but the
summand tn is 0 (mod n), so we get ba == 1 (mod n).  (Cf. p.24,
paragraph after proof of Prop.1.3.2, last two sentences.)
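
By the way, the process of finding such a b from the Euclidean
algorithm can be automated.  Here is a little Python sketch (my own,
not from the book; the names egcd and mod_inverse are made up) that
expresses (a,n) as a linear combination and reads off the inverse:

def egcd(a, b):
    # Return (g, s, t) with g = gcd(a, b) and g = s*a + t*b.
    if b == 0:
        return (a, 1, 0)
    g, s, t = egcd(b, a % b)
    # Here g = s*b + t*(a % b) = s*b + t*(a - (a//b)*b)
    #        = t*a + (s - (a//b)*t)*b.
    return (g, t, s - (a // b) * t)

def mod_inverse(a, n):
    # If (a,n) = 1 and s*a + t*n = 1, then s*a == 1 (mod n),
    # since the summand t*n is 0 (mod n).
    g, s, t = egcd(a, n)
    assert g == 1, "a and n must be relatively prime"
    return s % n

print(mod_inverse(11, 16))   # prints 3; indeed 3*11 = 33 == 1 (mod 16)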
----------------------------------------------------------------------
You ask how, in the proof of the Chinese Remainder Theorem (p.29), the
equation rm + sn = 1 implies rm == 1 (mod n) and sn == 1 (mod m).

Well, the equation rm + sn = 1 gives the congruence
rm + sn == 1 (mod n), and modulo n, the term sn is congruent to 0, so
the above congruence simplifies to rm + 0 == 1 (mod n), i.e.,
rm == 1 (mod n).  (Cf. p.24, paragraph after proof of Prop.1.3.2, last
two sentences.)  Turning the same equation into a congruence (mod m)
similarly gives the other formula.

=> Please remember to include in your question of the day the page to
which the question refers (along with information about the part of
the page to which it applies). <=  I have to respond to questions from
the 40 people in this class and the 20 in my graduate class, and it
helps if I don't have to spend time paging through the book to locate
the points to which questions refer.
----------------------------------------------------------------------
In connection with the Chinese Remainder Theorem (p.29) you ask about
solving simultaneous congruences

     x == a (mod m)
     x == b (mod n)

in the case where the condition (n,m) = 1 does not hold.

This can be done if and only if (n,m) | a-b (i.e.,
a == b (mod (m,n))).  Elegant, yes?  The term "The Chinese Remainder
Theorem" is often used for this more general result.  It is not too
hard to prove; you might try doing so.  But I guess the authors didn't
want to devote too much time to properties of the integers before
getting into the main topic of groups.
----------------------------------------------------------------------
You ask about the phrase "the same remainder as a" in Definition 1.4.1
(p.32).

Did you look up the word "remainder" in the index of the book?  When
you are unsure of the meaning of a mathematical word that the authors
use, you should always check first in the index.  If it is not there,
or if you can't understand the definition, or how that definition is
applied in the given situation, then you should ask about that (saying
"I looked up the word in the index, but ...").
----------------------------------------------------------------------
You ask about the relationship between statements (1) and (3) of
Corollary 1.4.6, p.36.

To see this relationship, use part (a) of Proposition 1.4.5.  That
statement gives a criterion for invertibility of [a] in terms of
relative primality with n.  On the other hand, the first clause of the
sentence given as the proof of Corollary 1.4.6 notes the relationship
between n being prime, and what integers are relatively prime to n.
Put them together, and you have the equivalence you ask about.
----------------------------------------------------------------------
You ask about the matrices in Example 1.4.1, pp.36-37.

That example uses the matrix technique of pp.12-13 to express
1 = (11,16) as a linear combination of 11 and 16.  (I had the class
skip both the material on pp.12-13 and this Example.)
----------------------------------------------------------------------
You ask about the origin of the word "totient" (Def. 1.4.7, p.38).

It is from the Latin, meaning "that many times".  (Similarly,
"quotient" means "how many times".)  The logical connection with the
definition of phi(n) is weak (there are lots of counting functions in
mathematics, and this one has no more right to be called the "totient"
than any other); but when people need a new word, they aren't always
picky.
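
Incidentally, whatever the word's pedigree, the function itself is
easy to experiment with.  Here is a naive Python sketch (mine, just
for exploration -- not an efficient method):

from math import gcd

def phi(n):
    # Count the k with 1 <= k <= n and gcd(k, n) = 1; by
    # Proposition 1.4.5(a) these k are exactly the ones for which
    # [k]_n is a unit in Z_n.
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

print([phi(n) for n in range(1, 13)])
# prints [1, 1, 2, 2, 4, 2, 6, 4, 6, 4, 10, 4]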
----------------------------------------------------------------------
You ask how Proposition 1.4.8 on p.38 is obtained.

They say it is proved in Exercises 15 and 27.  Actually, it is done in
the three exercises 15, 27 and 28.  You can try those exercises.  If
you have difficulty, ask and I will help.
----------------------------------------------------------------------
You ask why the number of elements in Z_n ^x is phi(n), as stated at
the bottom of p.38.

Recall that the elements of Z_n can be written [0]_n, [1]_n, ...,
[n-1]_n, and ask yourself which of these are units, using Proposition
1.4.5(a), p.36.
----------------------------------------------------------------------
You ask, regarding the proof of Theorem 1.4.11, p.39, whether one can
say, in the same way, that

     a_1 . a_2 . ... . a_{phi(n)-1}
          == a^{phi(n)-1} a_1 . a_2 . ... . a_{phi(n)-1} (mod n),

and so that 1 == a^{phi(n)-1} (mod n).

Definitely not.  The justification for the argument given in the book
is that a a_1, a a_2, ..., a a_{phi(n)} are in distinct congruence
classes (mod n), and there are phi(n) of them, hence there must be
_one_from_each_ congruence class relatively prime to n, so they must
be congruent to a_1, a_2, ..., a_{phi(n)} rearranged in some way.  On
the other hand, a_1, a_2, ..., a_{phi(n)-1} and a a_1, a a_2, ...,
a a_{phi(n)-1} are lists of fewer than phi(n) elements, so neither of
them has one element in each congruence class relatively prime to n
(mod n), so we cannot deduce that the terms of one list are congruent
to the terms of the other list rearranged.
----------------------------------------------------------------------
In connection with Example 2.1.1, p.47, you ask whether, in the case
of a function from the real numbers to itself, failure to be defined
at some value isn't just a case of discontinuity.

That terminology is used in some calculus texts, in particular in
Stewart, the text used at Berkeley; but it doesn't match the way
mathematicians use the term.  So while Stewart would call 1/x a
function from the real numbers to the real numbers having an "infinite
discontinuity" at x = 0, the correct description is that it is a
function R-{0} --> R which is continuous at every point where it is
defined, and approaches infinity as x --> 0.  (However, Math 113 is
not concerned with questions of continuity.)
----------------------------------------------------------------------
You ask how one can distinguish between cases where a formula yields a
well-defined function and cases where it does not, as in the first two
paragraphs of p.49.

Given any candidate description of a function, if the description
involves making choices, even implicitly, one must examine how
changing those choices affects the value of the function.  If it can
make a difference, the function is not well defined; if one can prove
that the result is the same for all choices, it is well defined.

When one defines a function on congruence classes [x]_n in terms of
the integer x, then there is a choice involved, since any congruence
class is the congruence class of many different x; so one has to test
the result to see whether the choice affects the answer.

Examples of the form f([x]_n) = [mx]_k will not be particularly
important in this course; they are given here simply because they
provide easy illustrations of well-definedness and
non-well-definedness.  That set of cases is dealt with exhaustively in
Exercise 11, p.55 (not assigned).
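
If you like to experiment, here is a small Python sketch (my own, not
from the book) that tests a candidate rule f([x]_n) = [mx]_k for
well-definedness by comparing its values on two representatives of
each class:

def is_well_defined(m, n, k):
    # f([x]_n) = [m*x]_k is well defined iff congruent inputs give
    # congruent outputs; since representatives of [x]_n differ by
    # multiples of n, it suffices to compare x with x + n.
    return all((m * x) % k == (m * (x + n)) % k for x in range(n))

print(is_well_defined(1, 4, 2))   # True:  f([x]_4) = [x]_2 is well defined
print(is_well_defined(1, 4, 3))   # False: f([x]_4) = [x]_3 is not

(A little experimentation suggests the rule is well defined exactly
when k divides mn; proving that is a good exercise.)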

There will be other cases where functions must be shown well-defined
later in the course; in general, the proofs will be given.
----------------------------------------------------------------------
You ask about the relation between Proposition 2.1.5 on p.52 (a
function between finite sets with the same number of elements is
one-to-one if and only if it is onto) and the similar result in Math
110 about a map between vector spaces of the same finite dimension.

The result in Math 110 is not a case of this result, because it only
concerns _linear_ maps.  One can have non-linear maps between vector
spaces of the same dimension that are one-to-one but not onto, or onto
but not one-to-one.  (For instance, the non-linear map of the real
line, a 1-dimensional vector space, into itself sending each real
number x to e^x is one-to-one but not onto.)  The result about linear
maps is _analogous_ to the result about set-maps, even though it is
not an instance of it.
----------------------------------------------------------------------
You ask how an infinite set S can have a one-to-one correspondence
with a proper subset (p.52, Example 2.1.6), since "the proper subset
will always have less elements than S".

The concepts of "more" and "less" elements don't work the same for
infinite sets as for finite sets.  The concepts involved in "counting"
infinite sets are touched on very briefly in Math 104, and studied
seriously in Math 135.  Till you take one of those courses, keep an
open mind.  If you come to office hours (on a Thursday or Friday --
Tuesday is full of people with questions about homework) and ask, I
can sketch some of the ideas.
----------------------------------------------------------------------
You ask, in connection with Proposition 2.1.6, p.52, whether both
functions f and g have to be one-to-one for the composite g o f to be
one-to-one.

The answer is "yes and no".  To be exact:

(a)  If f and g are both one-to-one, then so is g o f.

(b)  If f is one-to-one and g is not, then g o f may or may not be
     one-to-one.

(c)  If f is not one-to-one, then g o f cannot be one-to-one.

The book has proved (a).  You should not find it hard to find examples
of both cases in (b) (and thus to prove (b)), and likewise you should
not find it hard to prove (c).
----------------------------------------------------------------------
You ask in connection with Proposition 2.1.8, p.53, and the example of
the function f(x) = e^x taking R to R, which is one-to-one but not
onto, "Does this mean we define the function as being f: R --> R^+ as
opposed to f: R --> R?"

It depends on your goal!  If for some reason you need the function to
be invertible, and you don't care about changing the codomain, then
indeed you should do so.  But if you are studying functions R --> R
and you don't care about invertibility, then you should keep the
codomain unchanged.  If you are concerned with invertible functions
R --> R, then you just have to reject this one, since it isn't.

The book's statement in the middle of p.48, that a similar lack of
invertibility of the square root function "can be remedied in one of
two ways", may be misleading if it suggests that noninvertibility is
something that always needs to be remedied.  Don't assume that a
non-invertible function needs to be made invertible; but it is worth
bearing in mind that _sometimes_ it is of interest to do so.
----------------------------------------------------------------------
You ask for a concrete example of Proposition 2.2.3, p.59.

I illustrated it in class; but I'd like to emphasize that in a
situation like this, you should look at examples you already know of
the relevant concepts, and see for yourself how they apply.

The Proposition begins with "Let S be a set, and let ~ be an
equivalence relation on S."  So you can look back at the examples of
sets with equivalence relations on them, choose one that you are
comfortable with, and try to apply the Proposition to that example.
The first example of an equivalence relation given in the text is
Example 2.2.1, p.57, of congruence modulo n on the set Z, and this is
a relation we are familiar with from the preceding chapter, so it is a
good one to think about.  To make it concrete, you might take n = 10,
and observe that the equivalence classes are [0]_10, [1]_10, [2]_10,
[3]_10, [4]_10, [5]_10, [6]_10, [7]_10, [8]_10 and [9]_10.  The
proposition says that each x \in Z will be a member of one and only
one equivalence class.  What does this mean for x = 94 (as a random
example)?  Well, 94 belongs to [4]_10, but not to [1]_10, [2]_10,
[3]_10, [5]_10 etc.; this illustrates the result.

If you try something like this, but hit a point that you don't know
how to handle, e.g. if you don't know how to interpret some concept in
the proposition in the above case, then explain what you are trying to
do, and ask about the point you are having trouble with.  But as the
book says in the top paragraph of p.xi, you should always try to come
up with examples.  The many examples that the book gives of these
concepts should provide most of the examples you need.
----------------------------------------------------------------------
You ask why, in Example 2.2.8, p.61, S/~ and S/pi are the same, though
~ is an equivalence relation and pi is a map.

To see that, you have to look at what S/~ and S/pi mean.  S/~ means
the set of equivalence classes in S under the relation ~ (Definition
2.2.2, p.57).  S/pi means the set of equivalence classes in S under a
certain equivalence relation obtained from the function pi, as
explained in Example 2.2.6 (where a general function called f is
used).  If you examine the equivalence relation obtained from pi, you
find that it is the same as the original ~, so S/~ and S/pi are the
same.
----------------------------------------------------------------------
You say you "start getting confused" in the displayed string of
equalities in the proof of Proposition 2.3.4, p.69.

Well, the string is made up of steps, and you just have to look at it
step by step, and see whether you can justify each step.  Since your
question mentions the step "sigma(tau(a_j)) = sigma(a_j)", I will
explain how to figure out why that step holds, and hope you can work
out the rest.

Since the "sigma" is the same on both sides of the equation, it
appears that what is happening is that tau(a_j) is being rewritten
a_j.  So we ask ourselves, is tau(a_j) equal to a_j?  This will be so
if a_j is one of the elements that tau does not move.  Now a_j is one
of the terms that sigma cyclically permutes; and sigma and tau are
disjoint cycles, so any term that sigma moves, tau does not move.
Hence we do have tau(a_j) = a_j, and so sigma(tau(a_j)) = sigma(a_j).

That is the type of thinking you have to do when reading a math book!
----------------------------------------------------------------------
You ask about the authors' distinction on p.70 between arithmetic,
algebra, and abstract algebra.

I would state it this way:

   "Arithmetic":  computation of sums, products etc. of particular
        numbers.

   "arithmetic algebra":  general results about numbers, under the
        operations of addition, multiplication, etc.

   "Abstract algebra":  results about general systems of elements (not
        necessarily numbers) given with operations (which may or may
        not be called "+", "." etc.)

Goldbach's Conjecture and Fermat's Last Theorem, which you mention,
belong to the field of Number Theory (the study of arithmetic
properties of the integers), and this comes under the heading I have
called by the ad hoc term "arithmetic algebra" above.  Number Theory
is not as general in the subject of its study as is "Abstract
Algebra", but that doesn't mean it is less "advanced" than areas of
mathematics such as group theory.  Incidentally, the goal of studying
the integers leads number theorists to use a large number of related
systems as tools; so though the original questions they began with are
about integers, they do a lot of work with more general systems.
----------------------------------------------------------------------
You ask whether our base 10 system of arithmetic, based on our having
10 fingers, is fundamental to the properties of numbers that
mathematicians study.

Certainly not -- any more than the language they happen to speak
underlies the results they express using that language.

Incidentally, though a large fraction of the world's languages use
base 10, not all do.  The ancient Babylonians used base 60 -- or more
precisely, like us they counted up to 10, and like us they had words
for "20", "30", etc., but they stopped at "60" and used it, as we use
100, for their next unit.  Then 10x60, then 60x60, etc..  I believe
some Central American Indian languages use 15, and I have heard that
many languages native to New Guinea use numbers around 27:  When they
count, they don't count one finger after another, but point to the
knuckles on one hand, then the wrist, the elbow, the shoulders, and
down the other arm, ending with the knuckles of the other hand.  The
exact number of joints they point to, and hence the base of their
system, apparently varies from tribe to tribe.
----------------------------------------------------------------------
You ask how to apply Proposition 2.3.8, p.72, in the case when sigma
is itself a cycle.

To apply the proposition, we have to decide what is meant by the
l.c.m. of a single positive integer m.  Admittedly, the authors
haven't defined this (thanks for pointing this out; I will mention it
to them when I write them at the end of the semester).  But when one
thinks of what it should mean -- "the least integer that is a multiple
of all members of the 1-element set {m}" -- it is clear that this is
just m itself.  And indeed, the authors note in the first sentence of
the proof that in this case, the order of sigma is exactly m.
----------------------------------------------------------------------
You ask whether there is a name for a pair (S, *) where * is a binary
operation on S that does not satisfy all the conditions to be a group.

Yes, there are lots of names, depending on just what conditions are
satisfied.  If * is assumed associative, the pair is called a
"semigroup".  If it is associative and has a neutral element, it is
called a "monoid".  Those two are the most studied cases.  There have
been many names proposed for a pair (S, *) with no condition assumed
on the binary operation (in particular, no assumption of
associativity):  "magma", "groupoid", "binar" -- I prefer "binar".

If one also assumes that for every a, b \in S, the equations a*x = b
and x*a = b each has a unique solution, the object is called a
"quasigroup", and a quasigroup with a neutral element is called a
"loop".  And there are still more ... .  All of these together are not
studied as much as groups, and all the nonassociative structures
together are not studied as much as semigroups and monoids; but they
are each of some importance.
----------------------------------------------------------------------
You ask how, in the equation 0.a = 0 in the last paragraph of p.86,
the first "0" can be an integer and the second a group element.

It's simply a case of one symbol having two different meanings.  To
distinguish them, we could write the equation as 0_Z . a = 0_G, with
the subscript "Z" on the first "0" meaning that it represents the zero
of the set Z of integers, and the subscript "G" likewise showing that
the zero of the group G is meant.  But that isn't needed, because in
any group G written additively, we don't write x . y except when x is
an integer and y a member of the group, so that one can tell that in
"0 . a", 0 must mean the integer.

You also ask "How can 0 be the identity at all if zero times any
number is not that number but 0?"  0_Z is the identity of (Z, +),
because 0 + n = n for every n.  On the other hand, it is not the
identity of (Z, .), for the reason you state.  The convention is that
in a group written _additively_ (as (Z, +) is) the identity element of
the group is denoted 0.
----------------------------------------------------------------------
You asked whether the concepts of "subspace" in Math 110 and
"subgroup" in this course (Definition 3.2.1, p.92) are related, or
perhaps the same.

Related, yes.  The same, no!  If V is a vector space, then (V, +) is a
group.  In this situation, every subspace of V is a subgroup, but not
every subgroup is a subspace.  For example, letting V = R^2, the
subset Z^2 is a subgroup, but is not a subspace because it is not
closed under multiplying by all real numbers (scalars).

In virtually every branch of mathematics, for any kind of object
studied -- groups, vector spaces, rings, fields, modules, lattices,
monoids, etc. etc. etc. -- there is a concept of a "subobject"; and in
general, these correspond to subsets of the original objects that
satisfy appropriate "closure" conditions.  The concepts of subspace of
a vector space and subgroup of a group are both instances of that
general pattern.

You also asked about the fact that by Proposition 3.2.2(iii), a
subgroup contains g^-1 for every element g that it contains, but that
a subspace will not contain "0^-1".  In this connection, remember that
whenever we have a result about groups stated in multiplicative
notation, if we apply it to a group that is written additively we have
to adjust the notation!  So when Proposition 3.2.2 is applied to a
vector space, we must change "ab" to a+b, "e" to 0, and "a^-1" to -a.
Then we see that it is true that every subspace will contain -0, since
this is the same as 0.
----------------------------------------------------------------------
You ask about the use of the subset symbol for subgroups (p.92), given
that a group is formally defined as an ordered pair, while the subset
relation logically applies only to the first member of the pair, the
underlying set.

You are right.

I can only say that most mathematicians don't even bother about
defining a group as an ordered pair; they just consider a group to be
a set "given with" an operation, where what "given with" means isn't
made explicit (though if pressed, they would probably all agree that a
formal definition should make a group such an ordered pair -- here I
ignore the question of whether the inverse operation and the identity
should also be included as part of the group structure, since that
isn't what we're talking about).  So Beachy and Blair are to be
commended on at least being precise to that extent.  Of course, when
they write the subgroup relation with the inclusion sign, they are
really referring to inclusions among the first components of the
pairs.

Incidentally, group-theorists generally write "<" for "subgroup of".
However, they are not so much concerned with correcting the problem
you have brought up, as with using one symbol to say simultaneously
that it is a subset, and that it is closed under composition and
inverses.  They also write "<|" for "normal subgroup of" (i.e., a
sideways triangle.  Whether what I wrote looks like a triangle depends
on your machine's fonts.)
----------------------------------------------------------------------
In your question you complained about the use of the word "cyclic"
(Definition 3.2.5, p.95) in the infinite case, which "is just going to
confuse everybody".

But lots of words get used in ways very different from their original
meaning, and once people learn the current meaning, this is not
confusing.  "Orientation" originally meant "finding out which way is
East" (i.e., the orient).  "Graph" originally meant "anything
written".  "Size" is short for "assize", which meant "a session (of
parliament etc.)", and if I understand the history correctly, "assize
measures" then came to mean weights and measures that were fixed by
Parliament, which eventually gave our word "size" for "how big
something is" ... .
----------------------------------------------------------------------
You ask about the notation "n a" in Exercise 3.2.7, p.96.

This notation is explained in the last three paragraphs on p.86.  The
key point is that in a group written multiplicatively, a. ... .a is
written a^n, while in a group written additively, a+ ... +a is written
n a.  I.e., when multiplication is _not_ used to write the result of
applying the group operation to two elements, it is used to count how
many times the same element is added to itself.  This is, after all,
the same notation as is used with numbers:  a. ... .a = a^n,
a+ ... +a = n a.

The key fact one needs to learn here is that when a definition or
result about groups which was stated in multiplicative notation is
applied to a group written additively, one must everywhere change
"a.b" to "a+b", "a^-1" to "-a", and "a^n" to "n a".  So reread those
last three paragraphs of p.86.
----------------------------------------------------------------------
You ask how one can know, when a group is referred to, what operation
is intended.

That is an important point!  The answer is that you need to look back
at the definition of whatever group is being talked about.  For
instance, the definitions of the groups R of real numbers, Z of
integers, and Z_n of integers mod n all make the operation "+", while
the definitions of R^x and Z_n ^x make the operation multiplication,
and finally, the definitions of the groups S_n make the operation
functional composition.

As the authors say, a group is, strictly speaking, a pair (G, .)
consisting of a set and an operation.

But informally, mathematicians treat a group as a set G for which an
operation is "given".  This operation is named in the definition and
should be learned with that definition, and understood to be the
operation of the group when the contrary is not stated.

On rare occasions, more than one binary operation is considered
simultaneously on the same set.  In such situations, an author must
explicitly refer to "the group (G, .)" versus "the group (G, *)", or
"the group G with operation ." versus "the group G with operation *",
rather than just "the group G".  But in most situations, one can
simply understand "G" to mean "the group G with the operation that was
described when G was defined".
----------------------------------------------------------------------
You write about Example 3.2.10, p.97, "if S_3 denotes the set of all
permutations, how does it fit into the cyclic definition?"

I can't really tell what you're asking.  What a student should do on
seeing an example about S_3 and the property of being cyclic is to
apply that definition to each element a \in S_3 for himself or
herself, see what subgroup this gives, and see whether for any of the
choices of a that subgroup is all of S_3!  Did you do that?

One doesn't learn to play baseball, or do a martial art, or play a
musical instrument just by watching other people do it.  The same is
true of learning mathematics.  To understand the definition of cyclic
subgroup and cyclic group, you need to apply it yourself.

The only case where you could fairly say that this example was
"confusing" is if you tried to apply the definition yourself, and what
the book said didn't seem to match what you got when you did it, and
you couldn't figure out why.  Then it would be reasonable to ask to
have it explained.
----------------------------------------------------------------------
You ask about the term "diagonal subgroup" in Example 3.3.4, p.107.

To see why it is so called, graph the first few elements of this
subgroup, (0,0), (1,1), (2,2) etc., as points on graph paper.
----------------------------------------------------------------------
You ask about the statement in Example 3.4.5, p.116, that in
GL_2(Z_2),

     / 1 1 \ 3
     \ 1 0 /

is the identity matrix (referring in turn to p.109), and similar
calculations.

The thing to remember is that the "0" and "1" which are entries of
these matrices mean [0]_2 and [1]_2, and that, as discussed in section
1.4 (in particular, Table 1.4.1, p.33), [1]_2 + [1]_2 = [0]_2.
----------------------------------------------------------------------
You ask about proving a map onto when proving it an isomorphism, in
particular in connection with Example 3.4.5, p.117.

Note first that that example _does_ refer implicitly to the map being
onto.  It says that "phi is a one-to-one correspondence", and the last
sentence of Definition 2.1.4, p.51, says that the term "one-to-one
correspondence" (as distinct from "one-to-one map"!) means a map that
is one-to-one _and_onto_.

As they say in Example 3.4.5, the remarks about the unique forms of
respective elements show that phi is a one-to-one correspondence.
Specifically, to show that it is onto, take any element
x \in GL_2(Z_2), write it as a^i b^j using the matrices a, b shown in
that example, and let y = (1,2,3)^i (1,2)^j \in S_3.  Then we see that
phi(y) = x, as needed.

In general, to show a map is onto one has to look at a general element
of the codomain and show that there is an element of the domain that
maps to it.
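
If you want to see this onto-ness with your own eyes, here is a
Python sketch (mine, not the book's; I am assuming the generators of
Example 3.4.5 are a = [[1,1],[1,0]] and b = [[0,1],[1,0]] -- check
that against your copy) which multiplies out the six products a^i b^j
mod 2 and compares them with the set of all invertible 2 x 2 matrices
over Z_2:

from itertools import product

def mat_mult(A, B):
    # Product of 2x2 matrices with entries reduced mod 2.
    return tuple(tuple(sum(A[i][t] * B[t][j] for t in range(2)) % 2
                       for j in range(2)) for i in range(2))

I = ((1, 0), (0, 1))
a = ((1, 1), (1, 0))   # assumed image of (1,2,3); it has order 3
b = ((0, 1), (1, 0))   # assumed image of (1,2);   it has order 2

images = set()
for i in range(3):
    for j in range(2):
        M = I
        for _ in range(i):
            M = mat_mult(M, a)
        for _ in range(j):
            M = mat_mult(M, b)
        images.add(M)

# A 2x2 matrix over Z_2 is invertible iff its determinant is 1 (mod 2).
invertible = {((p, q), (r, s))
              for p, q, r, s in product(range(2), repeat=4)
              if (p * s - q * r) % 2 == 1}

print(images == invertible)   # True: all six elements of GL_2(Z_2) occur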

Sometimes Proposition 2.1.5 allows one to shortcut this, and get a
quicker proof.  (That would have been another way to do this case,
after proving phi was one-to-one.)
----------------------------------------------------------------------
Concerning the argument at the end of the first paragraph of the proof
of Theorem 3.5.1, p.120, you point out that if <a> is infinite cyclic
then the inverse of an element a^j with j>0 is not an element a^k with
k>0, and you ask how the authors justify the final assertion of that
paragraph.

What they mean is, "If n>0, take k = n; if n < 0, take k = -n.  In the
first case, a^k \in H by assumption; in the second, a^k \in H because
(a^n)^-1 = a^k.  So in either case, we have a positive integer k such
that a^k \in H."  That is, they take it for granted that the reader
sees that if n is positive, one has the desired result, and that they
only need to explain how to get it if n is negative.
----------------------------------------------------------------------
You ask, with respect to Definition 3.5.6, p.124, what kind of group G
has no exponent.

There are two ways this can happen.  If G has an element of infinite
order, then (whether or not it has elements of finite order too), it
will have no exponent.  On the other hand, if every element of G has
finite order, G can still fail to have an exponent if the set of
orders of elements of G is not bounded; e.g., if it has elements of
orders 2, 4, 8, 16, ... .

An example of the first case is, as you note, Z.  I can give you an
example of the second case, but you would have to sit down and think
about it to see this property, so I don't want to do so unless you say
you are ready to make that effort.  Let me know if you want one or
both of these examples.
----------------------------------------------------------------------
In connection with Definition 3.5.6, p.124, you wanted an example of
an infinite group having an exponent.

Fix an integer n > 1, and let G be the set of all infinite sequences
(x_1, x_2, x_3, ... ) of elements of Z_n.  For group operation, let

     (x_1, x_2, x_3, ... ) + (y_1, y_2, y_3, ... )
          = (x_1 + y_1, x_2 + y_2, x_3 + y_3, ... ).

Then you can verify that G is a group of exponent n; but it is an
infinite group, since there are infinitely many possible choices of
such sequences.  (Incidentally, this is what is called an infinite
direct product, Z_n x Z_n x Z_n x ... .)
----------------------------------------------------------------------
You ask about the verification that (lambda_a)^-1 = lambda_{a^-1} in
the proof of Cayley's Theorem, p.127.

In your calculation you treat (lambda_a)^-1 as the function taking x
to (ax)^-1; but this is not what it means.  It means the
_inverse_function_ of the function lambda_a.  For the definition of
"inverse function" see p.53, Definition 2.1.7.  The distinction is
important, so you should make sure you have it right.  As another
example, in Math 1A, the inverse function to f(x) = x^3 is the cube
root function, not the function giving 1/x^3.
----------------------------------------------------------------------
You ask about the way the lattice of subgroups of S_3 is drawn on
p.132.

The standard convention for drawing such lattices is that a line going
upward (whether straight up or at an angle) indicates that the
subgroup at the lower end is contained in the one at the upper end.
In general, such a line is drawn unless there is a subgroup in
between, in which case it is unnecessary.

E.g., there is no line from {e} to S_3 in that diagram, because the
fact that {e} is a subgroup of S_3 is implied by the facts that it is
a subgroup of each of the intermediate groups (as shown by the lines
going up from {e}), and each of them is a subgroup of S_3 (also shown
by lines).

Within those constraints, the group and its subgroups can be arranged
in any way.  On p.132 the authors have chosen to show three of the
intermediate subgroups as "lower" than the remaining one because they
have fewer elements, and to put them together at one side, because
there would be no way of arranging them around that other subgroup
that reflected any logical relationship among these.  But these were
their choices, not dictated by any rules.
----------------------------------------------------------------------
You ask how, at the bottom of p.133, the application of the
permutation (1,2,3) to Delta_3 = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3)
gives (x_2 - x_3)(x_2 - x_1)(x_3 - x_1).

The key to this is the statement four lines above the bottom on that
page, "Any permutation sigma \in S_n acts on Delta_n by permuting the
subscripts".  Now remember that if we take sigma = (1,2,3), this means
that sigma is the function defined by sigma(1) = 2, sigma(2) = 3,
sigma(3) = 1.  Hence, letting sigma act on the subscripts, i.e.,
letting sigma take each x_i to x_{sigma(i)}, we see that
sigma(x_1) = x_2, sigma(x_2) = x_3 and sigma(x_3) = x_1.  Applying
sigma to the whole polynomial Delta_3 now has the effect the authors
note.
----------------------------------------------------------------------
You ask whether there is a less computational proof of Lemma 3.6.6,
p.134.

How about this one?  The result is easy for transpositions of the form
(h,h+1), since these have the effect of sending x_h - x_{h+1} to its
negative, while taking all other differences x_i - x_j with subscripts
in increasing order (i.e., i < j) to differences still having
subscripts in increasing order.  Thus, (h,h+1) Delta_n = - Delta_n.

Now suppose inductively that for some h < k we have
(h,k) Delta_n = - Delta_n, and, assuming k < n, let us prove the same
for (h,k+1).  It is easy to verify that
(h,k+1) = (k,k+1) (h,k) (k,k+1).  Hence when (h,k+1) is applied to
Delta_n, the effect is to change its sign three times.  So
(h,k+1) Delta_n = - Delta_n, as required.
----------------------------------------------------------------------
You ask about the proof of Proposition 3.7.2(c), p.138, i.e.,
phi(a^n) = (phi(a))^n.

One proves it for all positive integers n as follows:  It is clear for
n = 1.  Suppose inductively that it is true for n = k, i.e., that
phi(a^k) = (phi(a))^k.  Then

     phi(a^{k+1}) = phi(a^k . a)
                  = phi(a^k) . phi(a)
                  = (phi(a))^k . phi(a)   [by the inductive assumption]
                  = (phi(a))^{k+1}.

Hence it is true for all positive integers n.  The case n = 0 follows
from part (a) of that Proposition, and the case of negative n follows
from the case of positive n using part (b).
----------------------------------------------------------------------
You ask what is meant by phi(H_1) in Proposition 3.7.6(a), p.140.

It means {phi(h) | h \in H_1}.  This is the notation for the image of
a subset under a map, defined in the display in the middle of p.61.
----------------------------------------------------------------------
You ask what is meant, in Proposition 3.7.7, p.141, by "natural
mapping".

It is what the authors called the "natural projection" on p.61,
Example 2.2.8.
----------------------------------------------------------------------
You ask where the proof of Theorem 3.7.8 (p.142) uses the fact that
phi is a group homomorphism.

Good question!  If a proof of a result doesn't seem to use one of the
assumptions, there's something fishy, and one should look into it.

In this case, as I said in class, the authors spent most of the proof
repeating the proof of Theorem 2.2.6, which is true for sets, and
doesn't use the fact that phi is a homomorphism.  They only consider
the group-theoretic properties in the last sentence of the proof, and
the fact that phi is a homomorphism is used exactly once:  In the
middle of the display in that sentence, at the step
phi(a)phi(b) = phi(ab).  Anyway, that gives what they need to complete
the proof.  Combining the fact that phi-bar is a homomorphism, proved
in that sentence, with the fact that phi-bar is one-to-one and onto,
proved in the preceding parts of the proof, one concludes that phi-bar
is an isomorphism.
----------------------------------------------------------------------
You ask whether the similarity of conditions (2) and (4), and likewise
of conditions (3) and (5), in Proposition 3.7.9, p.143, implies that
G_1 is commutative.

No.  The a b^-1 of condition (2) and the b^-1 a of condition (4) are
in general different elements; but the Proposition tells us that if
one of those elements is in Ker phi, then so is the other.  In
statements (3) and (5), the "k"s likewise represent different elements
of Ker phi.  (In fact, if you think about it you will see that the "k"
of (3) is a b^-1 and the "k" of (5) is b^-1 a.)

I recommend that you examine a sample case.  Let phi be the
homomorphism S_3 --> Z_2 that takes even permutations to [0] and odd
permutations to [1].  Figure out what Ker(phi) is; then look at some
elements a and b of S_3 that don't commute, and for each pair, check
whether a b^-1 belongs to Ker(phi), and whether b^-1 a belongs to
Ker(phi).  You will see that whenever one does the other does, and
whenever one doesn't the other doesn't.

You also ask why we need all these conditions.  One reason is that in
some situations it is more convenient to test whether a b^-1 belongs
to Ker(phi), in other situations, whether b^-1 a belongs to Ker(phi),
etc..  Also, the fact that the conditions are equivalent will help us
prove various facts later on.
----------------------------------------------------------------------
You ask whether the elements k in conditions (3) and (5) of
Proposition 3.7.9, p.143, are the same.

No.  Since each statement contains the phrase "for some
k \in Ker(phi)", the "k" is a bound variable in each statement, and
the truth of each statement is checked by letting k run over all
possible values in each case.  If some value of k makes the equation
in (3) hold, then statement (3) is true; if some value of k makes the
equation in (5) hold, then statement (5) is true; and these values do
not have to be the same.  As you note, if G is not commutative, they
usually won't be the same.  (Lucky we had the reading on bound
variables just in time for your question!)
----------------------------------------------------------------------
You ask about the book's assertion on p.146, second paragraph, that in
proving Lagrange's Theorem it was shown that the elements of [a] are
the elements of the form ha for h \in H.

The second paragraph of the proof of Lagrange's Theorem (p.99) shows
that the set [a] is the image of H under the map rho_a; i.e., that [a]
is the set of elements rho_a (h) for h \in H.
Now by the first line of that paragraph, rho_a (h) = ha; so the set of
elements rho_a (h) is just the set of elements ha.
----------------------------------------------------------------------
You ask how in Theorem 3.8.4, p.150, N is an identity element.

You have to think about what they are saying it is the identity
element of!  It is not an identity element of G, certainly.  But the
object G/N (which they are proving is a group) has cosets of N as its
elements, and has multiplication defined as in Proposition 3.8.3.  Now
N is a coset of N (namely, N = eN), so it is indeed an element of this
object.  To see whether it is an _identity_ element of the object, you
have to apply the definition of multiplication of G/N, and see whether
it satisfies the identity law.
----------------------------------------------------------------------
You ask about the significance of the statement "H = pi^-1 pi (H)" on
p.151, first paragraph.

Remember that a symbol "f^-1" can mean two different things:  If f is
an invertible function, f^-1 can be the name of the inverse of that
function, while if f is any function, f^-1 applied to a _set_ X means
"the set of elements that are mapped by f into X".  It is the latter
meaning that the symbol pi^-1 has here.

To take an example from a more familiar area, suppose f is the
function on real numbers given by f(x) = x^2.  Then
f^-1({9}) = {3, -3}.  Hence f^-1 f ({3}) is not the same as {3}; it is
the larger set {3, -3}.  In other words, a set X can expand when we
form f^-1 f (X).  So a nontrivial discussion is required on p.151 to
show that the set H does not expand when pi^-1 pi is applied to it.
----------------------------------------------------------------------
You ask what is meant by "set theoretic product" in Proposition 3.8.7,
p.151.

As noted at the beginning of the paragraph preceding the proposition,
the authors mean the multiplication of subsets of a group defined in
Definition 3.3.1.  "Set theoretic" is not really a good word for it;
they mean the element-by-element multiplication they have defined on
arbitrary subsets of G, rather than the special multiplication they
have defined on cosets of N.  (In any case, the next step is to show
that when the general product is applied to cosets of a normal
subgroup, it really does give the special multiplication defined on
the preceding page.)
----------------------------------------------------------------------
You ask why, in the statement on the top line of p.153, phi-bar is
onto.

To say a map is onto is to say that its image equals its codomain.
What is the codomain of phi-bar?  Unfortunately, the authors haven't
been explicit about this; but looking at the statement of the theorem,
you can see that they are trying to construct an isomorphism of
G_1 / K with phi(G_1), and from the proof, you can see that phi-bar is
supposed to be this isomorphism; so you can gather that the codomain
of phi-bar is supposed to be phi(G_1) (a subgroup of G_2, rather than
the whole group G_2).  Once you know this, can you see from its
definition that phi-bar is onto, i.e., has image exactly phi(G_1)?

You comment, in general, that the book says that things are "clearly
true" when they are not clear to you.  When a mathematician says
something is clearly true, this means that if you think about what
conditions must be satisfied for it to be true (and what facts have
been shown), you will find that those conditions are satisfied.  In
other words, it takes for granted that you will do your part of the
work.
In the above case, they slipped, and forgot that they hadn't said explicitly what the codomain of phi-bar should be.  But in general, I think they use the phrase appropriately.  If you notice more examples where, after first "doing your part of the work", you find that what they call clear is not, point them out to me.
----------------------------------------------------------------------
You ask why, as stated on p.153, second paragraph, a homomorphism phi is one-to-one if and only if its kernel is trivial.

This follows from Proposition 3.7.9 if you remember that a subgroup is called "trivial" if it equals {e}.  (It was also proved earlier, as Proposition 3.4.4, before we had the words to say it this way.)
----------------------------------------------------------------------
You ask whether the "x" in the definition of F[x] (p.164) belongs to F or to another set.

Not to F; it is an additional element, outside F, which has been brought in to generate, together with the elements of F, the ring F[x] of polynomials.  (However, when one evaluates a polynomial f(x) at some c\in F, one substitutes an element of F for x.)
----------------------------------------------------------------------
You ask whether the "x" in a polynomial f(x) (Definition 4.1.4, p.164) is a member of F.

Definitely not!  It is a new element that has been adjoined to F in forming the polynomial ring.  Given a polynomial f(x), one can substitute any element c\in F for x to get an element f(c)\in F; but f(x) itself is not in F.
----------------------------------------------------------------------
You ask why Proposition 4.2.5, p.176, assumes gcd(p(x),f(x)) = 1 rather than gcd(p(x),f(x)) = k for any nonzero constant k.

If a polynomial d(x) has the properties (i) and (ii) of Definition 4.2.3, then so does every polynomial we get by multiplying d(x) by a nonzero constant.  Hence, to choose one out of all these polynomials to call "the" g.c.d. of f(x) and g(x), one insists on the _monic_ polynomial satisfying (i) and (ii).  That is the reason for the word "monic" at the start of that definition.  The only monic constant polynomial is 1.
----------------------------------------------------------------------
You ask how, in the proof of Proposition 4.3.1, p.182, we know that r | a_0.

Because (as indicated earlier on the same line) r | a_0 s^n, and (as noted at the end of the line) (r,s) = 1.  Hence Proposition 1.2.3(b) is applicable.
----------------------------------------------------------------------
You ask how, in the proof of Theorem 4.4.6, p.190, the properties of addition and multiplication in F[x] imply the same properties for F[x]/<p(x)>.

As I said in class, the authors neglect to state how they are defining operations on F[x]/<p(x)>; but in complete analogy with the way operations were defined in Z_n and in factor-groups G/N, the definitions are

   [a(x)] [b(x)] = [a(x) b(x)]
   [a(x)] + [b(x)] = [a(x) + b(x)]

Proposition 4.4.4 shows that these operations are well-defined.  Once one has the operations defined in this way, the desired properties of these operations follow immediately from the corresponding properties of the operations of F[x].  For instance, in the display in the proof of Theorem 4.4.6, they verify commutativity of multiplication in F[x]/<p(x)> using commutativity of multiplication in F[x].  The first and last steps of that display are the applications of the above definition of multiplication of congruence classes; the middle step uses commutativity of multiplication in F[x].  All the other laws are deduced in the same way.
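If you like to experiment with these definitions on a computer, here is a little sketch in Python -- my own illustration, not anything from the book -- which represents congruence classes in Q[x]/<x^2 - 2> (an arbitrary choice of F and p(x) on my part) by their remainders of degree < 2, implements the two displayed operations, and checks one instance of the distributive law:

    from fractions import Fraction
    from itertools import zip_longest

    # p(x) = x^2 - 2, coefficients listed from the constant term up
    P = [Fraction(-2), Fraction(0), Fraction(1)]

    def poly_add(f, g):
        return [a + b for a, b in zip_longest(f, g, fillvalue=Fraction(0))]

    def poly_mul(f, g):
        out = [Fraction(0)] * (len(f) + len(g) - 1)
        for i, a in enumerate(f):
            for j, b in enumerate(g):
                out[i + j] += a * b
        return out

    def poly_mod(f, p):
        # remainder of f on division by p (no division of
        # coefficients is needed here, since p is monic)
        f = list(f)
        while len(f) >= len(p):
            c = f[-1]
            for i in range(len(p)):
                f[len(f) - len(p) + i] -= c * p[i]
            f.pop()
        return f

    # the operations on classes:  [f][g] = [fg],  [f]+[g] = [f+g]
    def cls_add(f, g): return poly_mod(poly_add(f, g), P)
    def cls_mul(f, g): return poly_mod(poly_mul(f, g), P)

    # one instance of distributivity, [a]([b]+[c]) = [a][b] + [a][c]:
    a = [Fraction(1), Fraction(2)]   # the class of 1 + 2x
    b = [Fraction(3), Fraction(1)]   # the class of 3 + x
    c = [Fraction(0), Fraction(5)]   # the class of 5x
    assert cls_mul(a, cls_add(b, c)) == cls_add(cls_mul(a, b), cls_mul(a, c))

Of course, checking one instance by machine is no substitute for the general verification: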
You should take some law, such as distributivity, and go through the calculation yourself, to check that it works.
----------------------------------------------------------------------
You ask whether the construction of the field of complex numbers you learned in High School is rendered obsolete by the construction of Example 4.4.2, p.191.

No.  One of the points of the concept of isomorphism is that if two objects are isomorphic, then mathematicians don't have to decide which one to attach a given name to; as long as they agree that "A" means a system that has the relevant properties, one of them can be thinking of A as meaning a system constructed one way and the other can be thinking of A as meaning a system constructed another way, and yet they can talk about these systems without any disagreement.  So if one mathematician thinks of the complex numbers as the set of symbols a + bi, and the other thinks of them as R[x]/<x^2 + 1>, that's no problem.  And in fact, most mathematicians are satisfied with thinking of C as meaning "a field generated over R by a root of x^2 + 1 called i", happy to have seen several ways of constructing such fields, and not feeling obliged to choose a particular one.

Different constructions of such fields are preferable for different purposes.  When one is not considering the broader context of general field extensions, it is not worth developing the machinery of factor-rings of polynomial rings, and one simply uses the "a+bi", or "(a,b)" description -- both in High School math and in Math 185.  But for algebraists, the description as R[x]/<x^2 + 1> is preferable because it is an instance of the general construction F[x]/<p(x)>.
----------------------------------------------------------------------
You ask whether the u of Theorem 4.4.8 (p.191) might belong to F.

If the polynomial p has degree 1, then u will belong to F, since a polynomial of degree 1 has a root in the given field.  (In that case, E = F[x]/<p> will come out isomorphic to F; so, after we make our identification of F with its image in E, we will have E = F.)  If p has degree > 1, then, being irreducible, it doesn't have a root in F, and u will lie outside of F.

You are right that if p already has a root in F, there is no need to construct the extension F[x]/<p>.  But in giving the proof of the theorem, it's shorter to do things in one way for all cases rather than saying "If p has degree 1, take E = F; if p has degree > 1, let us construct E as follows."
----------------------------------------------------------------------
You ask how one proves that Q[x]/<x^2 - 2> is isomorphic to Q(sqrt 2) (Example 4.4.3, p.192).

By the same method used in Example 4.4.2 to show R[x]/<x^2 + 1> is isomorphic to C.  The authors' approach is to go into details the first time they show you something, but assume that when they give you another instance of the same thing, you can provide those details yourself.
----------------------------------------------------------------------
You ask whether a ring (defined on p.197) can have more than one multiplicative identity element.

No.  Proposition 3.1.2(a) (p.81), applied to the multiplication operation of the ring, shows that it cannot.
----------------------------------------------------------------------
You ask about the usefulness of Proposition 5.2.3(c), p.212, which says that the image of 1 under a ring homomorphism is an idempotent element.

As I said in class yesterday, I think that a more natural concept of homomorphism of rings with identity would add the condition that such maps take the identity to the identity.
Homomorphisms which don't have that property, though allowed by the authors' definition, will not come up much in this course.  However, given that their definition does allow such maps, Proposition 5.2.3(c) is useful pedagogically in two ways.

First, it is implicitly a warning against assuming that a homomorphism f must always send the identity to the identity (especially needed since we have learned that a homomorphism of groups must do so).

Secondly, it is a useful lesson in reasoning about homomorphisms: The condition defining the multiplicative identity 1 of R relates it to all other elements of R.  Such conditions need not be preserved under a homomorphism, since elements of S that are not images of elements of R may not satisfy the appropriate equations relating them to the image of 1.  On the other hand, the property of being an identity implies 1^2 = 1, and this condition (idempotence) will be preserved by the homomorphism, i.e., it will also be true of the image of 1.  So we cannot say that that image will be the identity of S, but we can say that it will be idempotent.
----------------------------------------------------------------------
You ask, regarding Definition 5.2.4, p.213, whether, if phi: R --> S is a homomorphism of rings with unit, one has a multiplicative analog of the "kernel".

One can define the set {r\in R | phi(r) = 1}, but this will not be useful in the same way the additive kernel is.  Recall that if phi is a homomorphism of groups that are written multiplicatively, then phi(x) = phi(y) <=> x^-1 y \in Ker(phi).  But for elements x and y of a ring, x^-1 y does not in general make sense, hence such a criterion is not available.  On the other hand, if we start with the fact that in a group written additively, phi(x) = phi(y) <=> x - y \in Ker(phi), this can be applied in a ring R, because a ring is a group under addition, and in particular, x - y makes sense.  So we use additive kernels rather than multiplicative kernels in studying ring homomorphisms.
----------------------------------------------------------------------
You ask about the meaning of the equation "n . 1 = 0" in Definition 5.2.9, p.216, defining the characteristic of R.

The n is an integer, but the "1" denotes the identity element of R.  We can multiply an element of a ring by an integer because a ring is a group under addition, and on p.86, last three paragraphs, the authors define the product of an element of an abelian group and an integer.
----------------------------------------------------------------------
You ask why the authors call the description of the characteristic begun at the bottom of p.216 "more sophisticated" than that of Definition 5.2.9.

The definition as it is given is rather arbitrary -- people found that a certain number was useful in studying a ring, and they called it the "characteristic".  The re-definition shows why that number is important and what information it gives; namely, since it determines the kernel of a natural homomorphism Z --> R, it determines the structure of the image of that homomorphism, i.e., the structure of the set of "integers" of R.
----------------------------------------------------------------------
You ask about the statement following Example 5.3.2, p.220, that for R a general commutative ring, we should not expect R[x] to be a principal ideal domain.

Well, Example 5.3.2 used Theorem 4.2.2, a result proved about F[x] for F a field, but not proved about R[x] for R a general commutative ring.
Could one make the same proof work for R a general commutative ring?  That proof uses Theorem 4.2.1, so to make it work, we would want to have Theorem 4.2.1 true for R[x] where R is a general commutative ring.  So look at the proof of Theorem 4.2.1.  Does it _just_ use addition and multiplication, i.e., the ring operations of F?  If so, we can expect that it will work with F replaced by an arbitrary commutative ring.  Or does it also use the inverse operation, which does not exist for a ring that isn't a field?  It does.  So unless you can find a way to prove Theorem 4.2.1 without using division, you can't conclude that R[x] will be a principal ideal domain for R an arbitrary ring.

At that point, this doesn't say whether R[x] is or isn't a principal ideal domain -- it just says that the way we know F[x] is one when F is a field does not work for a general ring R, so we have to suspend our judgement.  Then the authors point you to an exercise in which the question is answered -- in the negative.
----------------------------------------------------------------------
You ask how, in proving Proposition 5.3.7, p.222, the authors can use pi^-1 without having shown pi to be one-to-one and onto.

In general, the map pi is not one-to-one, so it does not have an inverse, so the symbol pi^-1 does not mean "the inverse of the function pi".  If f: X --> Y is any function, whether or not one-to-one or onto, and S is a subset of Y, then one defines f^-1 (S) to mean {x\in X | f(x)\in S}.  This is called the "inverse image of S under f".  It is a different meaning of the symbol f^-1 from the meaning one gives it if f is invertible.
----------------------------------------------------------------------
You ask how, in proving Proposition 5.3.7(b), p.222, they get pi(J) = {[a] | a\in J} and pi^-1(J) = {a | [a]\in J}, and why "ideals are preserved".

The map pi is defined in statement (a) of the Proposition, and the two equations above are applications of this definition.

The authors also call on Proposition 3.8.6.  Did you check what that Proposition says, and see what application the authors are making of it here?  The difficulty with applying it is that Proposition 3.8.6 refers to subgroups, and not every subgroup of a ring is an ideal.  So they have to show that when the correspondence of Proposition 3.8.6 is applied to an ideal of R, the resulting subgroup of R/I is also an ideal, and vice versa.  That is what they mean by the correspondence preserving ideals.  To follow their proof that it does preserve ideals, you need to compare the definitions of "subgroup of R" and "ideal in R" and see what extra property a subgroup has to satisfy to be an ideal.  When you have done this, you will see that this property is exactly what they check in their calculations.
----------------------------------------------------------------------
You ask, in connection with the last line of the proof of Theorem 5.3.10, p.225, why 1\in J implies J = R.

Since J is an ideal, 1\in J implies that for every r\in R we have r.1 \in J, i.e., r\in J; so J contains all elements of R.
----------------------------------------------------------------------
You ask how the last displayed equation on p.229, (ad+bc) b'd' = (a'd'+b'c') bd, is relevant to showing addition is well-defined.

I assume that you followed the computation up to that point; in particular, that you see why showing addition is well-defined is equivalent to showing [ad+bc, bd] = [a'd'+b'c', b'd'].
Now [ad+bc, bd] and [a'd'+b'c', b'd'] denote the equivalence classes of the pairs (ad+bc, bd) and (a'd'+b'c', b'd'), so to say they are equal is to say that (ad+bc, bd) ~ (a'd'+b'c', b'd').  Now if you go to the definition of "~" at the top of the page, and apply it to this case, you will see that the condition for this to hold is exactly the equation you asked about.
----------------------------------------------------------------------
You ask about the meaning of "embedding" on p.232, next-to-last line before the exercises.

It means "one-to-one homomorphism".  The idea is that if a homomorphism f: A --> B is one-to-one, then f(A) is isomorphic to A, so we can think of f(A) as a copy of A "embedded" in B.
----------------------------------------------------------------------
You ask about the relationship between the concept of embedding (p.232), and the "surgery" I referred to earlier, when discussing the idea of identifying R with a subring of R[x].

An embedding means a one-to-one homomorphism.  So it is in the situation where one has an embedding that one may want to use what I called "surgery" (which is not a standard term, but just my attempt to make the idea vivid).  If phi: A --> B is an embedding, then (assuming A and B have no elements in common) one can form the set (B - phi(A)) \union A, which looks like B, but where the image of A under phi has been replaced by A itself; and make this set a ring (or group, etc.) in such a way that the elements a \in A play the roles previously played by the elements phi(a) \in B.  This new ring will be isomorphic to B, but will have A literally contained in it, rather than having a subring isomorphic to A.

I hadn't thought of this, but I realize that when mathematicians use the word "embedding", it is with this "surgery" or some substitute for it (e.g., an identification of A with phi(A) \subset B) in mind, since the metaphor is that of "putting A into B".  But the idea has been assimilated to the way it is used, so that now when mathematicians say "embedding", they don't really think "We get A inside B", but rather, "We get an isomorphic copy of A inside B", since an isomorphic copy of something is as good as the original for practical purposes.  The one way an isomorphic copy of something is not quite as good as the original is notationally -- it's cumbersome to have to write phi(a) in place of a in one's calculations.  It is for this reason that mathematicians talk of "identifying" A with its image in B, so that they can write "a" instead of "phi(a)".
----------------------------------------------------------------------
You ask whether, given an algebraic element u in an extension field of K, there can be more than one polynomial satisfying the definition of the "minimal polynomial of u" (Definition 6.1.4, p.236).

No.  Proposition 6.1.3 states the uniqueness of that polynomial.  However, I'm not sure the authors have really proved uniqueness.  The earliest point I have found where they assert it is in Example 5.3.7, p.225, where they "summarize the information we have"; but this information presumably comes from Theorem 4.2.2, p.174, which does not assert uniqueness.
Anyway, the uniqueness of the monic generator of any nonzero ideal of K[x], and hence in particular, the uniqueness of the minimal polynomial of an element of an extension field, is easy to see, in two different ways:

(1) If <f(x)> = <g(x)>, then each of f(x) and g(x) must be a multiple of the other, hence one gets one from the other by multiplying by an invertible element of K[x], i.e., by a nonzero element of K; so say g(x) = c f(x).  If f(x) is monic, then the coefficient of the highest degree term of c f(x) will be c, so if g(x) is also monic, c must be 1; i.e., f(x) = g(x).

(2) We know that any generator of an ideal I of K[x] must be a nonzero element of lowest degree, so if f(x) and g(x) both generate I they must have the same degree.  If they are also both monic, then their highest-degree terms are exactly the same, so if f(x) - g(x) were nonzero, it would be a nonzero element of I of lower degree, contradicting the assumption that f(x) and g(x) were of the lowest degree.  So again, f(x) = g(x).
----------------------------------------------------------------------
You ask, in connection with the comments on p.236, whether there is a general method to determine whether a given real number is transcendental.

The trouble is that there is no "general" way of specifying a real number.  The definition of the number pi comes from geometry, that of e comes from calculus; other numbers may be expressed by rules giving the n-th digit; etc. etc..  A rule for saying whether a number is transcendental would have to be limited to numbers specified in a certain way.  And there are, indeed, results showing that large classes of numbers described in specific ways are transcendental.  An example is mentioned in Exercise 8, p.240.  (In that exercise, u should be assumed positive, so that u^r will make sense.)
----------------------------------------------------------------------
You ask whether the method by which the polynomial of Example 6.1.2, p.236 was found doesn't guarantee that it will be irreducible, and hence make the subsequent proof of that fact unnecessary.

You may be right that it follows from the way the polynomial is constructed, but I don't see an obvious way to prove this.  And if we don't have such a proof, then we must prove irreducibility in some other way, which is what the authors do.  Interesting question.
----------------------------------------------------------------------
You ask whether in the proof of Proposition 6.2.1, p.240, the elements c_0, ..., c_n are elements of K(u) or of K.

To answer this, you should look at how the book introduces them.  They are the coefficients in "the minimal polynomial of u over K".  If you are not sure you remember what that phrase implies, you should go back to where it was defined, Definition 6.1.4.  This specifies that it is a polynomial in K[x]; so c_0, ..., c_n are elements of K.
----------------------------------------------------------------------
You ask whether the conditions gotten by negating the three parts of Proposition 6.2.3 (p.241) are also equivalent to each other.

Yes, definitely!  When the proposition says that (a), (b) and (c) are equivalent, this means that if any one is true, then all the others are true; i.e., for each u they are either _all_ true, or _all_ false.  Hence the proposition can also be expressed by saying that "not-(a)", "not-(b)", and "not-(c)" are equivalent to each other.
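If you want to experiment with the ideas in the last few answers, the sympy library for Python (my suggestion, not something the book uses) can compute minimal polynomials of algebraic numbers over Q:

    from sympy import sqrt, symbols, minimal_polynomial

    x = symbols('x')
    print(minimal_polynomial(sqrt(2), x))            # prints x**2 - 2
    print(minimal_polynomial(sqrt(2) + sqrt(3), x))  # prints x**4 - 10*x**2 + 1

sympy returns a polynomial with integer coefficients; dividing by its leading coefficient (here already 1) gives the monic minimal polynomial, whose uniqueness we discussed above.  Since sqrt 2 + sqrt 3 has a minimal polynomial, it is algebraic over Q, so all three conditions of Proposition 6.2.3 hold for it; for a transcendental number such as pi, the same call should raise an error, and "not-(a)", "not-(b)" and "not-(c)" hold together.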
----------------------------------------------------------------------
You ask in connection with Theorem 6.2.4 on page 241 whether for a longer chain of extensions, say D \subset K \subset E \subset F, one has [F : D] = [F : E][E : K][K : D].

Certainly.  First apply Theorem 6.2.4 to the chain D \subset E \subset F, to get [F : D] = [F : E][E : D], and then apply the same theorem to the last factor of this formula, [E : D] = [E : K][K : D].  Substituting the above into the preceding formula gives the formula you asked about.
----------------------------------------------------------------------
You ask what the authors mean by "the base field" in the paragraph preceding Theorem 6.4.5, p.253.

If F is an extension of a field K, then K is called the "base field" of this extension.
----------------------------------------------------------------------
You ask, in connection with Theorem 6.4.5, p.253, "What is meant by `unique up to isomorphism'?"

Recall that we say that an object satisfying a certain condition is _unique_ if any two objects satisfying that condition must be the same.  Well, we similarly say that an object satisfying a condition is _unique_up_to_isomorphism_ if any two objects satisfying that condition must be isomorphic to each other.
----------------------------------------------------------------------
You ask why in the proof of Theorem 7.1.1, p.275, they can say that H/N is a normal subgroup of G/N.

Because Proposition 3.7.4 shows that the kernel of any group homomorphism is a normal subgroup.
----------------------------------------------------------------------
You ask about the steps in the proof of Proposition 7.1.6, p.277:

   alpha i_a alpha^-1 (x) = alpha(a (alpha^-1 (x)) a^-1) = (alpha(a)) (alpha alpha^-1 (x)) (alpha (a^-1))

The first of these steps applies the definition of i_a.  The second applies the fact that alpha is a homomorphism, so that alpha(uvw) = alpha(u) alpha(v) alpha(w); in this case with u = a, v = alpha^-1 (x), w = a^-1.
----------------------------------------------------------------------
You ask about the step "a(xy)a^-1 = (a x a^-1) (a y a^-1)" in the proof of Proposition 7.1.4, p.277.

Well, simplify the right-hand side of that equation -- don't you get the left-hand side?

Perhaps your confusion came from the idea that every equality in a calculation should represent a simplification -- some straightforward way that "evaluating" the left-hand side gives the right-hand side.  But that is not so.  Sometimes one must "complicate" things in order to get what we are trying to prove.  Since we want to show i_a respects products, we must somehow turn i_a(xy) into i_a(x) i_a(y).  Those two expressions are a(xy)a^-1 and (a x a^-1) (a y a^-1) respectively.  To do this, one can insert e = a^-1 a between the x and the y of the first expression.  (Because of the identity law, inserting an e makes no difference.)  They don't say it this way; instead, they give an equation a(xy)a^-1 = (a x a^-1) (a y a^-1) which you can verify by seeing that the right-hand side simplifies to the left-hand side (rather than vice versa).
----------------------------------------------------------------------
You ask why an isomorphism of a group with itself gets a special name, "automorphism" (p.277, Definition 7.1.5).

For the same reason that a one-to-one map of a set onto itself gets a special name, "permutation".
Namely, when you have maps from a set or group X to itself, you can do various things, and ask various questions, that you can't do or ask for a map from one set to another: You can compose any two maps X --> X; this gives an operation of composition on the set of all such maps.  Given such a map you can ask what members of X are unchanged by applying it, and whether repeating it some number of times will give the identity map of X.  The _invertible_ maps from a group or set X back to X form a group, while the invertible maps from one object X to another object Y don't.  So one often wants to talk about such maps and only such maps; so one gives them their own distinctive name.
----------------------------------------------------------------------
You ask what it means to speak of an isomorphism of a group with itself (an automorphism, p.277, Definition 7.1.5), given that every group is isomorphic to itself.

The fact that every group is isomorphic to itself is seen by looking at the identity map 1_G defined by 1_G(x) = x for all x\in G.  But given a map phi from a group G to itself other than 1_G, one can still ask whether _that_ map phi is an isomorphism.  For instance, if G = Z, the map x |-> 2x is not an automorphism (it is not onto), and the map x |-> x+1 is not an automorphism (it does not satisfy phi(x+y) = phi(x) + phi(y)), but the map phi(x) = -x is an automorphism.

Automorphisms express a certain "symmetry" in the structure of a group, which is essential for the study of groups.  Unfortunately, most elementary algebra texts (including Beachy and Blair) tend to introduce the concept of isomorphism as giving statements that a certain group is isomorphic to a certain other group, rather than statements that a certain map between the groups is an isomorphism; so when one comes to the subject of automorphisms, where statements of the first sort are vacuous, students tend to wonder what there is to talk about.
----------------------------------------------------------------------
You ask why the map phi in the proof of Proposition 7.1.8, p.278 is one-to-one.

It isn't!  If it were one-to-one, the proposition would say Inn(G) =~ G.  Instead, it has kernel Z(G), so we conclude that Inn(G) =~ G/Z(G).  This is an example of the Fundamental Homomorphism Theorem, p.132.
----------------------------------------------------------------------
You ask why, in Example 7.2.1, p.280, the authors choose to conjugate a by b, rather than by another element.

If they conjugated a by an element that commuted with a, i.e., by some x such that ax = xa, then the result, xax^-1, would be the same as a.  Now a commutes with every power of itself, so they want to choose an element that is not a power of a, and they chose b.  By doing so, they found one element in the conjugacy class of a other than a itself.  They might then have gone on to conjugate it by other elements that didn't commute with it, and see whether they could find still other members of that conjugacy class; but they give a reason showing why a can't be conjugate to any element other than itself and a^2; so they can stop there.
----------------------------------------------------------------------
You ask what the one-to-one and onto function referred to in the statement of Proposition 7.2.5 is.

It is the function taking each left coset a C(x) to the element a x a^-1.  (The proof of the proposition shows that a x a^-1 = b x b^-1 if and only if a and b belong to the same left coset of C(x); hence the function described in the above sentence is well-defined.)
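If you would like to see this correspondence concretely, here is a quick check in Python -- my own illustration, not the book's -- with permutations in S_3 written as tuples giving the images of 0, 1, 2:

    from itertools import permutations

    def compose(p, q):
        # (p o q)(i) = p(q(i))
        return tuple(p[q[i]] for i in range(len(q)))

    def inverse(p):
        inv = [0] * len(p)
        for i, j in enumerate(p):
            inv[j] = i
        return tuple(inv)

    G = list(permutations(range(3)))   # the group S_3
    x = (1, 0, 2)                      # a transposition

    centralizer = [g for g in G if compose(g, x) == compose(x, g)]
    conjugates  = {compose(compose(a, x), inverse(a)) for a in G}

    # the one-to-one correspondence of the proposition gives
    # |conjugacy class of x| = [G : C(x)]:
    assert len(conjugates) == len(G) // len(centralizer)

Here the conjugacy class of x turns out to have 3 elements and C(x) has 2, so [G : C(x)] = 6/2 = 3, as the proposition predicts.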
----------------------------------------------------------------------
You ask about the symbols in Example 7.2.4, p.283, having two numbers in parentheses, one above the other.

Such a symbol, with a nonnegative integer n above a smaller nonnegative integer i, represents the number of ways of choosing i objects from a set of n objects, and it is commonly pronounced "n-choose-i", so I will write it that way here.  The formula is n-choose-i = n! / (i! (n-i)!); this is also equal to n(n-1)...(n-i+1) / (i(i-1)...1).  So, for instance, 4-choose-2 can be computed as 4!/(2! (4-2)!) = 24/(2.2) = 6, or as 4 . 3 / (2 . 1) = 6.
----------------------------------------------------------------------
I'm not sure whether what I said in class answered your question about the proof of Theorem 7.2.8, p.284.  You wanted to understand the statement "x not-in Z(G) implies [G:C(x)] > 1", and its relation to [G:C(x)] being divisible by p.  In case I didn't make it clear, I will go over it here.

We know that [G:C(x)] is a divisor of |G|, and since in this theorem |G| is a power of p, [G:C(x)] is also a power of p.  The powers of p are 1, p, p^2, p^3, ...; all of these are divisible by p _except_ 1.  So to show [G:C(x)] is divisible by p, we just need to show that it isn't 1.  Now the index of a subgroup, [G:H], is 1 only when H = G; so what we need to show is that for the terms [G:C(x)] appearing in the class equation, C(x) is not G.  And indeed, the elements x such that C(x) = G are the elements of Z(G), and all the resulting "1"s are gathered together in the term |Z(G)|, while the terms [G:C(x)] in the summation are all > 1.  Thus (for G a p-group) all those terms are divisible by p.
----------------------------------------------------------------------
You ask about the relationship between the two ways Definition 7.3.1, p.286, defines a group action: as a "multiplication of elements of S by elements of G" and as a "function G x S --> S".

G x S is the set of all ordered pairs (a,x) with a\in G and x\in S.  Hence a function G x S --> S means an operation that associates to every such ordered pair (a,x) an element of S.  One could call this function f, and write the element f(a,x), but it is common to use an abbreviated notation and call the element a x.  Thus, the function is the same as a multiplication.
----------------------------------------------------------------------
You ask whether, in the concept of a group acting on a set, defined on p.286, the group and the set have to be related in some way; for instance "How would it work if the group G consisted of polynomials and the set S was a set of matrices?"

Well, if for that G and that S, you found some map G x S --> S that satisfied the conditions of Definition 7.3.1, then you would have defined a group action of G on S.  If not, then you wouldn't have such a group action.  If you defined one map G x S --> S satisfying those conditions and someone else defined a different map satisfying those conditions, you would have two different group actions of G on S.  In other words, the concept of a group action doesn't just depend on G and S and whether they are somehow "related"; it depends on _defining_ what will be the multiplication function.

Taking your example, suppose G is the group of all real polynomials under addition, and S is the set of all 2x2 matrices over the real numbers.  Given a polynomial P and a matrix M, define P.M to be the matrix e^{P(7)} M.
(I.e., evaluate the polynomial P at x = 7 (an arbitrary choice), exponentiate the result, and multiply M by the resulting number.)  If you check Definition 7.3.1 (remembering that the group of polynomials is written additively, so you will have to write (a+b) instead of (ab); but the operation of the group on the set is still written multiplicatively), you will find that the conditions of the definition are satisfied, so that is an action of polynomials on matrices.  There are plenty of others I could also have defined.
----------------------------------------------------------------------
You ask why Definition 7.3.1, p.286, doesn't include a condition that the operation be well defined.

Well, I'd put it the other way around: The statements in the definitions of group, field, etc. that the operations are well-defined are unnecessary, because they are redundant: If a function "isn't well-defined", it's not a function.  So as soon as they say "a function" this requires it to be well-defined.  I've tried not to tamper with the wording of the book's definitions except where I thought it was absolutely necessary; but if I were writing it, I would not put the words "well-defined" into any of these definitions.  (Though I would include warnings to the student that in setting up definitions of functions, if any choice is involved, well-definedness must be checked.)

One can ask why the authors included well-definedness in other definitions but not here.  I would guess it is because they considered the early parts of the book aimed at a more naive readership, and the later parts for more experienced students, and felt that pointing out the obvious, that a function has to be well-defined, wasn't necessary in the later part of the book.
----------------------------------------------------------------------
You ask why Gx is called the "orbit" of x under G (p.287, Def.7.3.3).

The easiest answer is that it describes "everywhere that x goes" when elements of G are applied to it.

There's another way of looking at it that is more complicated to describe.  One important case of the concept of group action is in physics, where if we let S denote the set of all states of a system, then the group R of real numbers acts on S: If t is a real number and x a state, then t x denotes the state that a system in state x will end up in after passage of t units of time.  In this situation, R x actually represents the whole "path" that the physical system moves through with the passage of time, if its state at time 0 is x.  (This is not necessarily a periodic path, like the orbit of a planet.  However, the Latin word from which we get "orbit" doesn't just mean a periodic path; it means any path.)
----------------------------------------------------------------------
You ask whether the identity element belongs to S^G, as defined on p.287.

No, S^G is a set of elements of S, while the identity element belongs to G, and (except in special cases) these are unrelated sets.

But perhaps you mean "Is the identity element one of the elements "a \in G" appearing in the definition of S^G ?"  In that case, the answer is definitely "yes".  However, even though every element x in S satisfies e x = x, that doesn't make every x belong to S^G.  The definition says "... for all a\in G", while the fact that e x = x merely shows that the condition holds "for some a\in G".
----------------------------------------------------------------------
You ask about the relation between Theorem 7.2.6 and Theorem 7.3.6 (p.289).
The earlier theorem is a special case of the later one -- the case where S is the group G itself, and the action of G on S is by conjugation, as in Example 7.3.5.
----------------------------------------------------------------------
You ask, in connection with the end of the proof of Theorem 7.3.8, p.290, "how the result that |S^C| == 0 (mod p) and |S^C| not-= 0 is the same as the number of solutions of x^p = e being a multiple of p".

In the proof of Theorem 7.3.8, each member of S^C corresponds to a solution of x^p = e.  So |S^C| equals the number of such solutions, and the result |S^C| == 0 (mod p) says that that number is a multiple of p.  The statement |S^C| not-= 0 is then used to deduce that this multiple of p is >0, hence is at least p, hence that there is more than one solution to x^p = e, hence there is a solution other than x = e, i.e., an element of G of order p.
----------------------------------------------------------------------
You ask in connection with the proof of Lemma 7.5.3 on p.297, how we know that an element g \in <a_1, ..., a_n> can be written g = a_1^m_1 ... a_n^m_n.

For the answer, look at "Lemma 3.2.15" in the note "The subgroup of a group generated by a set" that I put on the back of the homework that was due October 3, and also observe that if G is commutative, we can take the terms called t_1 ... t_n in that lemma, and rearrange them so that all the terms a_1 and a_1^-1 come first, followed by all the terms a_2 and a_2^-1, etc..  Combining all the terms a_1 and a_1^-1, we get one term a_1^m_1 (where m_1 will be positive, negative, or zero depending on whether there were more terms a_1, more terms a_1^-1, or equal numbers); combining all the terms a_2 and a_2^-1, we get a term a_2^m_2, etc..
----------------------------------------------------------------------
You ask in connection with the proof of Lemma 7.5.3 on p.297, how G/<a> being a direct product allows us to deduce from "a_1^m_1 ... a_n^m_n = <a>" that for each i, we have a_i^m_i = <a>.

In saying "G/<a> is a direct product", they really mean that the map

   < a_1 > x < a_2 > x ... x < a_n > --> G/<a>

is an isomorphism.  Now that map sends the n-tuple (a_1^m_1, ..., a_n^m_n) to a_1^m_1 ... a_n^m_n.  Hence if the latter element is <a>, the identity element of G/<a>, then the n-tuple must also be the identity element of the product-group, i.e., must be the n-tuple (<a>, ..., <a>); so each component a_i^m_i must be <a>.
----------------------------------------------------------------------
You ask about the relation between the Fundamental Theorem of Finite Abelian Groups (p.297) and the Fundamental Theorem of Arithmetic.

They are similar in that each shows how something can be broken up, in a unique way, into "standard" parts.  But they differ in the details.  For instance, the Fundamental Theorem of Arithmetic gives exactly one factorization of 8 into primes: 8 = 2^3.  But the Fundamental Theorem of Finite Abelian Groups shows that there are three different nonisomorphic abelian groups of order 8: Z_8, Z_4 x Z_2, Z_2 x Z_2 x Z_2.
----------------------------------------------------------------------
You ask why, at the end of the third paragraph of section 9.1, p.356, they don't require that delta(rs) >_ delta(r).

Because that condition is included in the condition they do state, namely that for all r and s, delta(rs) >_ delta(s).  Since that condition applies to _all_ r and s, we can apply it with the roles of r and s reversed, and get delta(sr) >_ delta(r).
But the ring is commutative, so sr = rs, and this gives the inequality you ask about.
----------------------------------------------------------------------
You ask why Euclidean domains are called "Euclidean" (p.357).

Euclid (around 300 BC) described the algorithm for finding the g.c.d. of two integers by repeated division with remainder (p.10).  When this same method was generalized (thousands of years later) to various other rings, the name "Euclidean algorithm" was still used for it; so when people defined the general concept of a ring in which this method could be applied, they called it a "Euclidean domain".
----------------------------------------------------------------------
You ask about the function delta in the definition of Euclidean domain, Definition 9.1.1 on p.357.

An integral domain D is called a Euclidean domain if there is _any_ function delta satisfying the conditions of Definition 9.1.1.  (The authors' phrase "there is assigned" is a bit confusing; I will suggest that they change it.)
----------------------------------------------------------------------
You ask, in connection with Theorem 9.1.2, p.358, for an example of a principal ideal domain that is not Euclidean.

The use of a Euclidean norm is by far the easiest way to show that a ring is a principal ideal domain, so an example of the sort you ask for actually takes considerable work to establish.  It is easy to _name_ such a ring: Z[(1 + sqrt -19)/2].  For proofs of the two essential facts -- that it does not admit a Euclidean norm, and that it is a principal ideal domain -- you can go to my web page http://www.math.berkeley.edu/~gbergman , click on "Handouts for graduate courses", and then click on the second item there, "A principal ideal domain that is not Euclidean, developed as a series of exercises".  I've just looked it over, and I don't think it uses any results beyond the scope of Math 113.  It refers to a certain set of complex numbers as a "module" over R; in this context, that simply means the set is an additive group, and closed under multiplication by members of R.  It uses "phi" and the term "Euclidean function" where Beachy and Blair use "delta" and the term "Euclidean norm".  If you try working through it and have problems, you can ask me.
----------------------------------------------------------------------
You ask about the possibility of p being a unit, in the second paragraph of the proof of Proposition 9.1.9 (p.360).

Good point!  If p is a unit, then pR = R, so pR is not a proper ideal of R, so by Definition 5.3.8 (p.224), it is not prime.  So the result is still valid in this case, but the authors should have considered this case in their proof.  I'll point this out in my letter of suggestions to them.
----------------------------------------------------------------------
You ask, in connection with the proof of Theorem 9.1.12, p.361, whether there are rings in which a (nonzero nonunit) element d can't be written as a product of irreducibles.

I hoped to give examples yesterday, but I ran out of time.  Here is one: Let R be the ring generated over Z by the square root of 2, the fourth root of 2, the eighth root of 2, etc. etc. (i.e., by adjoining to Z the 2^n-th root of 2 for _all_ positive integers n).  Then in R,

   2 = (square root of 2)(square root of 2)
   square root of 2 = (fourth root of 2)(fourth root of 2)
   fourth root of 2 = (eighth root of 2)(eighth root of 2)
   . . .
   2^{n-1}st root of 2 = (2^n-th root of 2)(2^n-th root of 2)
   . . .
so no matter how far one goes, each of the factors one gets can be factored further, and one never gets irreducibles.
----------------------------------------------------------------------
You ask how, at the end of the proof of Lemma 9.2.4, p.364, the equation ad f*(x) = bc g*(x), and the fact that a is an associate of b, and c is an associate of d, imply that f*(x) and g*(x) are associates.

The statement that a is an associate of b means that one can write b = ua, where u is a unit; similarly c = vd where v is a unit.  Substituting into the equation ad f*(x) = bc g*(x), we get ad f*(x) = (ua)(vd) g*(x).  Cancelling the nonzero factors a and d gives f*(x) = uv g*(x), which makes f* and g* associates.
----------------------------------------------------------------------
You ask about the statement that 1 + 2 gamma and 1 - 2 gamma are irreducible, in Example 9.2.1, p.365.

This is shown in the pair of sentences beginning with the last word of p.365.  Those sentences show that if we assume 1 + 2 gamma = ab, with a and b nonunits, we get a contradiction.

You also ask about the factorization 21 = (1 + 2 sqrt -5)(1 - 2 sqrt -5) at the end of that example.  One verifies this simply by multiplying out the right-hand side.  The way the authors present this makes it seem to come out of a hat; if I were doing it I would begin by saying "Let's take a random element of Z[sqrt -5], say 1 + 2 sqrt -5, and let us multiply it by its conjugate to get an integer: (1 + 2 sqrt -5)(1 - 2 sqrt -5) = 21.  We know 21 can be factored 3 . 7.  Are the factors in the two factorizations of this number, 3 and 7 on the one hand, and 1 + 2 sqrt -5 and 1 - 2 sqrt -5 on the other, irreducible in Z[sqrt -5] ?  If so, and if neither 3 nor 7 is an associate of 1 + 2 sqrt -5 or 1 - 2 sqrt -5, then this will show that we have nonunique factorization in this ring"; and then I would give the calculations showing that the elements are indeed irreducible and not associates of one another.
----------------------------------------------------------------------
You ask about the proof of de Moivre's Theorem, p.389 bottom.

As the authors note in the sentence before it, it is proved by repeated application of the formula

   r (cos phi + i sin phi) . t (cos theta + i sin theta) = rt (cos (phi + theta) + i sin (phi + theta))

to the complex number cos theta + i sin theta.  So try applying the above formula to (cos theta + i sin theta)^2 = (cos theta + i sin theta) . (cos theta + i sin theta); then try using the same formula to find (cos theta + i sin theta)^3 by multiplying the result of your first calculation by cos theta + i sin theta again.  By this point, you should see the pattern, and be able to translate it into a proof by induction.  Let me know if you have any difficulty doing this.
----------------------------------------------------------------------
You ask whether complex conjugation (p.393) has any connection with conjugation in the sense of group theory.

Not directly.  What they have in common is that both give automorphisms: for any element a of a group G, i_a: G --> G is an automorphism of the group G, while complex conjugation is an automorphism of the field C.  I suspect that the names were given to the two operations at a time when the concept of "automorphism" had not yet been defined.
People working with complex numbers realized that numbers of the form a + bi and a - bi had parallel properties, and people working with groups realized that elements g and a g a^-1 had parallel properties, and in both cases they called these elements "conjugate", from the Latin meaning "yoked together". Once the concept of automorphism was defined, one could explain that these "parallel properties" resulted from the fact that in each case, one element was the image of the other under an automorphism. ---------------------------------------------------------------------- You ask whether induction is the best way to prove Theorem A.7.4, p.405. Yes; the proof involves building up a set step by step, and induction is the way to formulate an argument of an arbitrarily large number of steps where each step depends on the step before. However, an author can give an inductive argument without using the word "induction". He or she can instead say things like "Repeat this process until all n elements of T have been used". Whether one calls it induction is a matter of style; but it is still induction, either way. ----------------------------------------------------------------------