ANSWERS TO SOME QUESTIONS ASKED BY STUDENTS in Math 245, taught from my notes "An Invitation to General Algebra and Universal Constructions", http://math.berkeley.edu/~gbergman/245, Spring 2008, Fall 2011, Spring 2014, Fall 2015, and Fall 2017. These are my responses to questions submitted by students in the course, the answers to which I did not work into my lecture, but which I thought might be of interest to more students than the ones who asked them. (Responses to questions submitted before 2015 have been adjusted to the numbering of the 2015 version of the text. Since the published version and the online preprint version differ in pagination, I refer to results by number but not by page. All chapter-numbers have increased by 1 since the pre-2015 versions of the text, because the software used in the published version does not allow a text to start with "Chapter 0", as this text previously did.) ---------------------------------------------------------------------- You ask whether, in Exercise 2.2:2, when I speak of groups yielding the same pair, G'_1 = G'_2, I mean isomorphic pairs. No; by equality I mean equality! If you have proved a result about isomorphism, you can submit that as homework -- investigations of questions of one's own devising are accepted as homework, if they are relevant to the course -- but the question asked was about genuine equality. ---------------------------------------------------------------------- Regarding the construction of group-theoretic terms in section 2.5, you ask whether we need to consider $X$ of uncountable cardinality. That depends on what purpose we will be using our terms for. If we want to use them to write down identities, then the countable case is enough. (We'll show that explicitly, toward the end of the course, for arbitrary sorts of algebras. It is the second paragraph of Lemma 9.4.2, in the case where \gamma = \aleph_0.) 
On the other hand, if we have some explicit uncountable group G (for example, the group of all bijections from a countable set S to itself), and we want to reason about all relations satisfied by an uncountable set X of elements of G, then we need to use terms in that uncountable set. ---------------------------------------------------------------------- You ask why, in the middle of the paragraph following Exercise 2.5:2, I say that \iota_T(x.y) would be x.y^{-1}, rather than (x.y)^{-1}. I say earlier in the paragraph that "We need to be careful". This example illustrates how a rule that one might naively propose for defining terms and term operations as strings of symbols and operations on those strings could go wrong. The rule in question defines \iota_T to simply append the symbol ^{-1} to whatever symbol one plugged into it; and as this example shows, it would not have the properties one wants. In the next paragraph I talk about using parentheses as part of one's string of symbols, which does give what one wants. (It uses more parentheses than the minimal number needed, but at least it works.) ---------------------------------------------------------------------- Regarding my statement in the next-to-last sentence of section 2.6, that when one forms the set T of group-theoretic terms in \{x,y,z\}, the term "y" represents "the ternary second-component function", you point out that as a set, \{x, y, z\} doesn't specify that y is in "second position". Good point. I agree that writing X = \{x, y, z\} in no way puts an ordering on x, y, z. I guess I was thinking of the symbols x, y and z as symbolically meaning "first variable", "second variable" and "third variable", as though they were written x_1, x_2 and x_3. I'll have to think about what to do here in the next revision. Thanks for pointing it out! 
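Both of the answers above turn on how terms are represented as strings of symbols. Here is a tiny Python sketch (my own illustration; the function names are made up, not from the text) of the collision that the naive parenthesis-free rule produces, and of how full parenthesization avoids it:

```python
# Terms as strings: a naive rule that appends "^{-1}" without adding
# parentheses assigns the same string to two genuinely different terms,
# while a rule that always parenthesizes keeps them distinct.
# (Illustrative sketch only; these names are not from the text.)

def naive_mult(s, t):
    return s + "." + t               # no parentheses added

def naive_inv(s):
    return s + "^{-1}"               # just append the symbol

def mult(s, t):
    return "(" + s + "." + t + ")"   # always parenthesize

def inv(s):
    return "(" + s + ")^{-1}"

# The terms (x.y)^{-1} and x.(y^{-1}) are distinct, but the naive
# rules give them the same string "x.y^{-1}":
collision = naive_inv(naive_mult("x", "y")) == naive_mult("x", naive_inv("y"))

# With full parenthesization the two strings differ:
distinct = inv(mult("x", "y")) != mult("x", inv("y"))
```

As in the text, the parenthesizing rule uses more parentheses than the minimal number needed -- inv(mult("x", "y")) comes out as "((x.y))^{-1}" -- but distinct terms now get distinct strings.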
---------------------------------------------------------------------- You ask whether the two downward arrows in the diagram in Definition 3.1.3 represent the same map h, or whether there is a distinction between the map regarded as a set map and as a group homomorphism. They do indeed represent the same map h. Since a group homomorphism is a set-map that satisfies appropriate conditions with respect to the group operations, the same map can be both a set-map and a homomorphism. Its occurrence in the triangle is based on the fact that it is a set map, and so can be composed with the horizontal arrow of that triangle, and the result equated with the diagonal arrow, giving us a triangle of set maps. Its appearance on the right, on the other hand, is based on its being a group homomorphism. It is the unique map having the combination of set-theoretic and group-theoretic properties shown by these two appearances. ---------------------------------------------------------------------- You ask about the remark after the last diagram in section 3.1, that since any two free groups on X are "essentially" the same, one often speaks of "the" free group on X. Whether it is reasonable to use "the" in such a situation depends on what one is focusing on. If one is interested in the group-theoretic properties of the group, these are the same for any two isomorphic groups, so one thinks of all free groups on X as being "the same for the purposes of discussion", and it is natural to use "the". On the other hand, if one is interested in questions such as whether the set X is actually contained in F, or mapped into it by a map u which is not an inclusion map, then this will differ for different realizations of the free group structure, and one would call each of them "a free group on X". Regardless of one's point of view, it is not quite precise to say "the" free group, since the different objects with that universal property are not literally the same.
But it is OK to speak a little imprecisely as long as the speaker and hearer understand what is actually meant. ---------------------------------------------------------------------- Your pro-forma question was why conditions (3.2.4) and (3.2.5) needed to be specified to be sure T/~ was a group; specifically, why they didn't follow from the other relations. The idea of your answer was right -- that instead of assuming that "~" is a relation of a naturally occurring sort, which one could expect to satisfy (3.2.4) and (3.2.5), one should think in terms of coming up with an "artificial" relation, which would have no reason to satisfy those conditions merely because it satisfies the others. One can, in fact, describe a concrete way of getting such "artificial" relations. Start with a relation ~_0 which does come from a map v of X into a group G; and then let ~ be an equivalence relation containing ~_0, gotten by choosing two equivalence classes [p] and [q] of ~_0 and "joining them into one" under the new relation. You should find it easy to show that ~ will then satisfy (3.2.1)-(3.2.3) and (3.2.6)-(3.2.8), but, assuming ~_0 had more than just the two equivalence classes [p] and [q], that ~ will not satisfy (3.2.4), so that (3.2.9)-(3.2.11) will not give well-defined operations on T/~. You also asked whether (3.2.4) (with the other conditions) would at least imply (3.2.5). You thought it probably wouldn't; but in fact it will. One can show this using the fact that from (3.2.4) and the other conditions, T/~ acquires a structure with a well-defined multiplication operation satisfying the consequences of (3.2.1)-(3.2.3). From (3.2.3) it then follows that for each element [p] of that structure, [p^{-1}] will be an inverse; one then argues that inverses must be unique. So what is the point of including (3.2.5)?
To give a procedure which, without such subtle arguments, shows the existence of free groups, and which incidentally can be used with other sorts of algebraic structures to which those subtle arguments might not be applicable. ---------------------------------------------------------------------- You ask what is meant by a "predicate", where I say in the third paragraph after display (3.2.8), that if we think of relations as predicates, then the "intersection" operator on them becomes the logical operator "and". A "predicate" means, roughly, an assertion, with some set of blanks to be filled in. In grammar, when one divides the sentence "The man is happy" into "subject and predicate", one calls "the man" the subject, and "is happy" the predicate; and that predicate can be combined with other subjects to give other sentences. So "predicate" is extended by logicians to mathematics, where conditions such as "=" and "isomorphic to" can be called binary predicates, "is positive" (among real numbers) and "is prime" (among natural numbers) can be called unary predicates, etc.. A relation can be formalized as a set of pairs, or thought of as a predicate, and the notation varies accordingly. ---------------------------------------------------------------------- You ask for references to how the set-theoretic difficulties with "generalized operations" raised in the 2nd paragraph after Exercise 3.3:5 can be resolved. Actually, they can be resolved using the ideas of section 7.4 of these notes. I just didn't want to point the reader in section 3.3 to something that he or she might find hard to follow at this point. ---------------------------------------------------------------------- Regarding the paragraph preceding Proposition 3.4.6, you ask how the description of what s_v does to a shows that s \neq t in T_{red} implies s_v(a)\neq t_v(a).
Well, the description shows that if s\in T_{red}, then s_v takes a to the word consisting of precisely the string of symbols of s, with parentheses removed, followed by a. Now if s and t are different members of T_{red}, then on removing parentheses, they still have different strings of symbols, since members of T_{red} all have parentheses clustered to the right (see discussion leading to (3.4.1)) and "^{-1}" applied only to elements of X, not to longer parenthesized expressions. Hence, adding an a at the end, we get different words s_v(a) and t_v(a). Does this make sense now? ---------------------------------------------------------------------- Regarding the construction of the quotient group at the end of section 4.1, you ask whether there is some other way of constructing that group either "from above" or "from below" ... When we find the normal subgroup N generated by a set S of elements of G, we can do this either "from above" or "from below", as discussed in that section. But once we have found it, all we have to do to impose the corresponding relations on G is to divide out by it. So the "from above / from below" distinction comes in at an earlier point in this construction. ---------------------------------------------------------------------- You ask about the "striking properties" of the group introduced in Exercise 4.3:12. The most striking one involves the concept of a "left orderable" group -- a group G that can be given a total ordering "\geq" such that if two elements g,h\in G satisfy g\geq h, and f is any element of G, then fg\geq fh. It is easy to show that such a group has no element of finite order other than e, and for many years it was an open question whether the converse was true. This group turned out to be a counterexample.
(I've recently learned that the person who discovered this fact, David Promislow, had been running a computer program to test whether the group had a certain property known to be necessary for a group to be right orderable. The computer turned up a counterexample to that property, and Promislow at first thought there must be an error in his program; but he checked it and found it was right.) ---------------------------------------------------------------------- You ask what is meant by "normal form" in section 4.4 (2nd paragraph after (4.4.3).) Did you check out the phrase in the index, and re-read the paragraph where it is defined (the boldface number in the index entry)? ---------------------------------------------------------------------- You ask why the commutator subgroup, referred to in section 4.4, is also called the derived subgroup. I don't know. It might simply be that early in the development of group theory, it was the one important general way of constructing a subgroup from a group, so it was given a very "basic" sort of name. Or it might be that group theorists first started using the symbol G' for this important construction, and then gave it the name "derived" because in calculus, f' denotes the "derivative" of f. ---------------------------------------------------------------------- You asked, in connection with exercise 4.4:7, "Is there any reason not to consider a free solvable group on a set X?" Well, that is essentially what the last sentence of the exercise is asking _you_! So you need to look at the construction of free groups and the other universal objects introduced so far, and also look at the concept of "solvable group", and see whether you can adapt the general methods of constructing free objects to that concept. 
In trying to do so, you need to look at the differences between the concepts of "group" and of "solvable group", and see whether those differences pose obstructions to one or another of the methods of constructing free objects; if you find that one of those methods goes over with no changes at all, point this out. If all of them run into difficulties, note what those difficulties are, and see whether you can overcome them in at least one case. If you can't, then try to prove a non-existence result. Have you attempted any of this? If so, what were your results? ---------------------------------------------------------------------- You ask about the reason for the term "residually finite", introduced just before Exercise 4.5:2. Well, in certain contexts, a factor object of a mathematical object is said to consist of "residues". I guess this goes back to the ring Z/nZ, where people often think of its set of elements as \{0, 1, ..., n-1\}, i.e., the residues that one gets on dividing arbitrary elements of Z by n. The most common use of the term I am aware of in algebra is when R is a local ring with maximal ideal p; then R/p is called "the residue field" of R. Perhaps in group theory G/N was at some time called the "residue group" of G on dividing by N. Then if G can be thought of as "living in" the direct product of its finite residue groups, it is reasonable to call it "residually finite". Other words can be substituted for "finite". E.g., if the elements of a group can be distinguished by homomorphisms into solvable groups, then it is "residually solvable". I'm so used to this use of "residually" that I hadn't even thought of where it came from. ---------------------------------------------------------------------- Concerning the proof of Proposition 4.6.5, you say > I don't quite see what the purpose of constructing this set A is ... I hope that what I said in class answered your question.
Though it is easy to see that every element of the universal group generated by images of G and H can be written in the form (4.6.4), and to define operations on such expressions, it is not easy to show that these operations satisfy the group axioms; equivalently, that the group identities, together with the relations satisfied by our generators, don't somehow imply equality between two different expressions (4.6.4). But by finding a group of permutations of a set on which all those relations are satisfied, and in which different expressions (4.6.4) yield different permutations, we get this conclusion, and hence have the desired description of the "coproduct" of the groups G and H. ---------------------------------------------------------------------- You ask why there are different symbols for direct sums and direct products of abelian groups, even though, as noted in the 5th paragraph of section 4.7, they are the same construction. I think it's mainly historical. The symbol A (+) B developed within group theory and module theory, as representing an abelian group or module in which every element was uniquely representable as a sum a + b with a\in A, b\in B. The symbol A x B developed within set theory, probably based on the fact that when A and B are finite sets, with A having m elements and B having n elements, then A x B is a set with m n elements. The concept "A x B" spread to all areas of mathematics, since the concept of direct product is important everywhere (e.g., if R is the real line, then R x R is the plane). In particular, if A and B are two rings, or (not necessarily commutative) groups, then A x B has a natural structure of ring or group. In retrospect, one sees that when A and B are abelian groups, this construction A x B is isomorphic to the construction A (+) B that people had been using all along. So the result was two symbols for the same construction.
But as mentioned in the notes, for infinite families, the corresponding constructions are distinct; another reason for using distinct symbols. ---------------------------------------------------------------------- You asked whether the map X --> F has to be an inclusion. Please remember, in submitting future Questions of the Day, to specify what point in the text your question refers to. I think you were referring to the map from a set X to the underlying set of the free monoid on X, introduced near the beginning of section 4.10. In constructions of free algebraic objects, the map u: X --> F does not have to be an inclusion. For instance, one of the constructions of a free abelian group that we saw in section 4.4 took the underlying set of the group to consist of integer-valued functions on the set X, and the universal map carried each x\in X to the function X --> Z which had the value 1 at x, and 0 at all other points. The element x is certainly not the same as that function. The map u will, however, in general be a one-to-one map (often called an "injection".) In our notation, we often use the same symbol, e.g., x_i, for an element of X, and its image in our universal object, relying on context to tell us when x_i itself is meant, and when u(x_i) is meant. But that is simply a shorthand to avoid messy notation. ---------------------------------------------------------------------- You ask whether for monoids, as for groups, the free object on a larger number of generators can be embedded in the free object on a smaller number of generators. Yes. Actually, for monoids examples like that are "easier" to come by in one way, but "harder" in another, than for groups. "Easier" in that within the free monoid on \{x,y\}, the countably many elements xy^n (n=0,1,2,...) are free generators of a free submonoid. This is not so for groups: if we abbreviate xy^n to z_n, we find that (z_m)^{-1} z_n = (z_0)^{-1} z_{n-m}, so the elements z_0, z_1, ...
satisfy nontrivial group relations. (However, there is also an easy example in free groups: the elements w_n = y^{-n} x y^n generate a free subgroup.) "Harder" because while it is known (though not that easy to prove) that every subgroup of a free group is free on some set of generators, this is not true for monoids. E.g., if x\in X and F is the free monoid on X, then the elements x^m as m ranges over all nonnegative integers other than 1 (or more generally, over all nonnegative integers greater than some fixed integer n>0), form a submonoid which is not free. ---------------------------------------------------------------------- You asked, regarding the connection between the Weyl algebra and Quantum Mechanics mentioned after Exercise 4.12:3, whether I could recommend any texts giving a mathematical treatment of Quantum Mechanics. I asked a few colleagues, and two books were suggested. One is an old book, "Mathematical Foundations of Quantum Mechanics" by George W. Mackey, now republished by Dover. (Dover republishes old out-of-copyright mathematical works at low prices -- a valuable service!) The other is "Introduction to Quantum Mechanics" by Hannabuss. I also learned that we have a course in the mathematics of quantum mechanics, Math 189. ---------------------------------------------------------------------- You asked about the treatment of Galois theory via tensor products, mentioned after Exercise 4.13:4. There are two write-ups of courses Lenstra has taught on the subject; he describes them as "each having its own imperfections": http://websites.math.leidenuniv.nl/algebra/topics.pdf http://websites.math.leidenuniv.nl/algebra/Galoistheoryschemes.pdf The first is notes from a 250A he taught here, written up by him and two students. The second is from a course he gave long ago in Leiden (written up by an unknown person, and found on the web), which did the analog of Galois theory for rings more general than fields.
He would be happy to learn of any errata you find in either of them. You also mention the site http://en.wikipedia.org/wiki/Grothendieck's_Galois_theory Unfortunately, at the moment Wikipedia seems to be down, so I can't check it out. ---------------------------------------------------------------------- You ask about the statement following Exercise 4.14:3 that the idempotents in a commutative ring R correspond to the continuous {0,1}-valued functions on its spectrum. I assume below that you are familiar with basic algebraic geometry: On the one hand, suppose r\in R is idempotent. Then 0 = r - r^2 = r(1-r), so any prime ideal P contains either r or 1-r. Assume P contains r. Then 1-r, being 1 - (member of P), is not in P, so it is invertible mod P, so when one localizes at P one can cancel it from "r(1-r)=0", getting r = 0 in the localization. Likewise, if 1-r\in P, then in the localization, 1-r=0, i.e., r=1. So the continuous function on Spec R induced by r is everywhere {0,1}-valued. Conversely, if f is a continuous {0,1}-valued function on Spec R, then the subsets of Spec R on which it is 0, respectively 1, will be open-closed. Hence in the structure sheaf, the function agreeing with the global section 0\in R on the first of these sets and with 1\in R on the other, i.e., f itself, will be a global section, i.e., a member of R. ---------------------------------------------------------------------- Regarding the construction of coproducts of sets in section 4.15, you write > Let's say we have S=(S_0,S_1) we are forming the coproduct Q. If a,b > are in S_0, and if (a\cup b)=c, then we would expect the left > injection, as a homomorphism, to satisfy ... "Homomorphism" is a concept relevant to algebraic structures; it means a map between underlying sets that respects the operations of the sort of algebra in question.
When we talk of a coproduct of sets, these are sets without any additional structure; the analog of a homomorphism is just a map of sets; so it is not expected to satisfy any other conditions. I guess you were thinking that a homomorphism of sets should respect "the kinds of things one studies in set theory", which includes things like the operation of taking unions of sets. That kind of structure is what we studied in the section on Boolean algebras. But the concept of a map of sets is simply that of a function, so the coproduct of a family of sets is simply a universal instance of a set with a function from each of those sets into it. ---------------------------------------------------------------------- You ask about the use of the phrase "generators and relations" with respect to sets, following Exercise 4.15:1, asking how one can "generate" anything when one is not given any algebra-operations. We are carrying the phrase "generators and relations" over from the cases of groups, monoids, rings, etc., in order to show the parallelism of the constructions. But in the case of sets, the situation is indeed degenerate, and nothing more is "generated" than the given elements of X. ---------------------------------------------------------------------- Regarding the construction of the set obtained from a set X by imposing relations given by R following Exercise 4.15:1, you write > ... the relation will make certain elements of X equivalent and > then we just need to pick one of them as a representative. ... That is an old-fashioned way of looking at these constructions. It sometimes has advantages; but usually the nicer way is to let the set of equivalence classes itself be one's new set. The one "disadvantage" of that approach is the problem of visualizing it as a "collection of collections".
The way I tell my students in Math 113 to think of the result of "dividing" a set by an equivalence relation is that such a set consists of new elements, each of which arises by "gluing together" one or more elements of X. No mathematical difference; but picturing them as "stuck together" rather than "loose" may be less confusing. One way or the other, the idea is to have a set having a many-to-one relationship with X: different elements of X correspond to the same element of the new set if and only if they are in the same class under the equivalence relation. ---------------------------------------------------------------------- In connection with the examples at the beginning of section 4.17, you ask (explaining that you have not yet taken a topology course) what is meant by "closure". In general, the closure of a subset S of a topological space T means the set of points of T that are either in S, or are limit points of S. When the topology comes from a metric (distance function), something that you will have seen in Math 1AB and 53 (even though not under the names "topology" and "metric"), a limit point of S is simply a point which is the limit (in the sense of those courses) of a convergent sequence of points in S. So, for instance, when T is the real line, and S is an open interval (a,b), its closure is the closed interval [a,b]; while for the same T, and S the set of rational numbers, the closure of S is the set of all real numbers. This description of the closure is not valid for topological spaces that aren't metric spaces, as noted in lines 4-6 after Exercise 4.17:1, but it at least gives a start at picturing the concept. > ... How does this not hold in the example (4.17.1)? (4.17.1) consists of two examples, side-by-side; let's talk about the first. It represents the image of the real line under a continuous map into a compact rectangular region of the plane.
The closure of that image consists of the image itself, the left endpoint of the wiggly line (which is not itself in the image), and, on the right side -- well, you can see that limits of sequences of points in that wiggly line will exist all up and down a vertical interval. So there is no way to extend the map R --> K to a continuous map R\cup\{+-\infty\} --> K: the point +\infty would have to simultaneously be mapped to all the points of that vertical interval. The other example is similar. ---------------------------------------------------------------------- You ask whether, if we replace "compact Hausdorff" with "locally compact Hausdorff" in the definition of the Stone-Cech compactification (Definition 4.17.2), we get a "local Stone-Cech compactification" construction, which turns Q into R. Unfortunately, no. To see this, consider any irrational number \alpha, and let f be the inclusion-map of Q into R - \{\alpha\}. The space R - \{\alpha\} is locally compact (every point of that space has compact neighborhoods), but it has no point "where \alpha should go", so the map f does not factor through the inclusion of Q in R. Intuitively, this shows that in making Q locally compact, there is no need to insert an element "where \alpha should go". Since \alpha was an arbitrary irrational number, there is in fact no extra point that "has to be inserted". Yet if we don't insert any point, we don't get local compactness; so I don't think there is a universal local-compactification. For a little more intuition, let \beta be the cube root of 2. Let h: Q --> Q be the function such that h(x)=x for x not between \beta and \beta^2, but h(x) = 2/x for x in that interval. This self-homeomorphism of Q turns that interval upside-down, while leaving the rest of Q unchanged, showing that the topology of Q is far from determining its order-structure.
Hence there can be no natural way to construct from the topological space Q the space R, whose topology does almost determine its order-structure. (It determines it up to reversal.) However, if we regard Q as a metric space, then that metric space certainly does determine the metric space R, namely, as its completion; and the completion construction can indeed be regarded as a universal construction on metric spaces. ---------------------------------------------------------------------- You ask whether Lemma 4.17.3 can be proved without the Hausdorffness condition on K. No. For example, let K be any set, given with the topology under which the closed subsets are K itself and its finite subsets. Then any infinite subset of K is dense. Hence if we let X be any infinite set, we can map it into topological spaces K constructed as above so as to have dense image, though there is no upper bound on the cardinalities of such spaces K. ---------------------------------------------------------------------- Regarding the discussion in the middle of section 4.18, you write > You note that for a space to possess a universal covering space > it must in general be semi-locally simply connected. ... No; I note that this and the other conditions listed are the assumptions that are shown in [90] to be sufficient to make the construction described work; I don't say that a universal covering space can't exist if they don't hold. The author of [90] may even have known of more general hypotheses under which a universal covering space exists, but decided that it would be best to prove a result that applied to most "naturally arising" spaces, rather than go through a much messier argument to cover some additional pathological cases. In particular, it looks to me as though some spaces that do not satisfy the condition of being locally pathwise connected (which is also in the list), can have universal covering spaces. 
E.g., consider the subset of the plane consisting of the union of the line-segments y = cx with x\in [0,1], and c taking on the value 0 and all values 1/n for positive integers n. This X is contractible, hence simply connected, so it should be its own universal covering space; but it is not locally pathwise connected (that fails near any point (x,0) with x>0). However, it is indeed hard to see how a space that is not semi-locally simply connected could have a universal covering space. To see why, let us assume for simplicity that this condition fails in the neighborhood of the base-point x_0. Then there are non-contractible loops that stay arbitrarily close to that point. Suppose (p_i)_{i=1,2,...} is a sequence of such loops that stay closer and closer to x_0. Then the sequence p_i will approach the trivial loop, which stays at x_0. Hence by continuity of the map p\mapsto \~{p}, their liftings \~{p_i} should be approaching the constant base-point map in the universal covering space. But since they are non-contractible, they should end up in different "layers" of that covering space. If points in those different layers can approach the basepoint, this contradicts the definition of covering space, which forces the inverse image of every point of X to be discrete. However, for non-locally-pathwise-connected X, maybe it would be most natural to modify the definition of covering space, e.g., by replacing "discrete" with "totally disconnected". (Note: I haven't actually looked at algebraic topology since I was a student in the '60's; so the above comments are not based on reliable expertise.) ---------------------------------------------------------------------- Regarding Definition 5.1.3, you ask whether there is a reason why we speak of "isotone maps" of partially ordered sets, rather than "homomorphisms". 
When we get to category theory (Chapter 6), we'll introduce the general term "morphism", covering the various sorts of maps that come up in different areas of mathematics: homomorphisms of algebras, isotone maps of partially ordered sets, continuous maps of topological spaces, etc.. Till then, we are using the traditional terms; and "homomorphism" is traditionally used for functions that respect operations on algebras. If we have entities that mix algebraic and non-algebraic structure, such as topological groups, we may say "continuous homomorphism" or "homomorphism as topological groups"; but one rarely uses "homomorphism" when there are not some operations to be respected. ---------------------------------------------------------------------- In connection with the concept of an isotone, i.e., order-respecting map, defined in Definition 5.1.3, you ask whether there is a term for an order-reversing map. Yes. It's called an "antitone" map. Not too commonly used, but it exists. ---------------------------------------------------------------------- In connection with the paragraph before Definition 5.1.4, which notes that partially ordered sets are not algebras in the sense used in this class, you ask "Could we not get around this issue by concentrating on relations instead of operations?" Yes; but then objects such as groups, rings, etc. would have to be described as having certain relations R subject to the condition that for every x and y there exists a unique z such that (x,y,z)\in R; and results on free groups, rings, etc., on groups, rings, etc., presented by generators and relations, and so on, would all have to be formulated in terms of relations with this particular property. So to create a comfortable context for studying these things, I have chosen to work with algebras, defined by sets with operations, and to consider more general relational structures to be nearby relatives which we visit when we need them.
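To make the operations-as-relations point above concrete, here is a small Python sketch (my own illustration, not from the text) encoding the group operation of Z/3Z (addition mod 3) as a ternary relation R, and checking the condition that for every x and y there exists a unique z with (x,y,z)\in R:

```python
# The group operation of Z/3Z encoded as a ternary relation:
# the set of all triples (x, y, x+y mod 3).
G = range(3)
R = {(x, y, (x + y) % 3) for x in G for y in G}

# The condition that makes a ternary relation "operation-like":
# for every x and y there exists a unique z with (x, y, z) in the relation.
def is_operation(rel, elements):
    return all(
        sum(1 for z in elements if (x, y, z) in rel) == 1
        for x in elements for y in elements
    )

# Dropping a single triple destroys existence for the pair (1, 2),
# so the resulting relation no longer encodes an operation.
R_broken = R - {(1, 2, 0)}
```

The point of the passage above is that every axiom about groups, rings, etc. would have to be routed through checks like is_operation if one worked with relations instead of operations.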
---------------------------------------------------------------------- You suggest that the rule that associates the graph called the Hasse diagram to a finite poset (two paragraphs before Definition 5.1.6) can be used for arbitrary infinite posets P as well. Unfortunately, the graph you get can lose a lot of information from an infinite P. Think of what you get from the poset of real (or rational) numbers! Can you figure out under what conditions on P that diagram will preserve all the order relations? ---------------------------------------------------------------------- You ask about the 1/3 and 2/3 in Fredman's Conjecture (Exercise 5.1:11). These are forced on us by the 3-element partially ordered set that has one pair of comparable elements, and one element incomparable to both of these. (E.g., the set of integers {2,3,4} under divisibility.) There are only three linearizations of that ordering, so for any pair of incomparable elements, the ratio of the linearizations that put one above the other to those that put them in the reverse order can only be 1:2 or 2:1; so the former have to constitute 1/3 or 2/3 of the linearizations. One can get more examples where the best one can do is 1/3 or 2/3 from the above one; e.g., by throwing in a chain of elements all lying above the three elements mentioned, and/or a chain of elements all lying below them, or in more ingenious ways, such as by putting one copy of the above poset on top of another. But I would guess that if one excluded posets constructed in these ways, then on those that remained, one could assert some narrower interval around 1/2 than [1/3, 2/3]. (One might do this exclusion by defining on each poset P the equivalence relation generated by the condition of being incomparable under the given ordering. Then the posets that would have to be excluded would be those in which all equivalence classes had cardinality 1 or 3, and the 3-element equivalence classes had two elements comparable under the ordering.) 
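The 1:2 split for this 3-element poset can be checked by brute force; here is a small Python sketch (names are my own, not from the text) that enumerates linear extensions.

```python
from itertools import permutations

# The 3-element poset {2, 3, 4} under divisibility: the only strict
# relation is 2 < 4; the element 3 is incomparable to both.
elements = (2, 3, 4)
relations = {(2, 4)}

def linear_extensions(elements, relations):
    """All total orders (as tuples) compatible with the given strict relations."""
    return [p for p in permutations(elements)
            if all(p.index(a) < p.index(b) for (a, b) in relations)]

exts = linear_extensions(elements, relations)
print(len(exts))   # 3 linearizations, as stated

# For the incomparable pair 2, 3: in how many linearizations does 2 come first?
below = sum(1 for p in exts if p.index(2) < p.index(3))
print(below, len(exts) - below)   # a 2 : 1 split, i.e. fractions 2/3 and 1/3
```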
---------------------------------------------------------------------- You ask whether we can generalize the preorder "divides", referred to in the first paragraph of section 5.2 as a relation on elements of commutative rings, to noncommutative rings. We can, once we decide what definition to use. One says that x "right divides" y if we can write y = ax, and that it "left divides" y if y = xb; each of these is a preorder; they correspond to the inclusion relations on the left ideals Ry, Rx, respectively the right ideals xR, yR, generated by our elements. One could similarly call x an "interior divisor" of y if y can be written axb, and this would also be a preorder, though I have never seen this considered in ring theory. (This relation is not equivalent to inclusion between the 2-sided ideals RxR and RyR, for the reason mentioned in the two sentences preceding Ex. 5.12:2. The relation of inclusion between RxR and RyR would yield still another "divisibility-like" preorder on elements.) ---------------------------------------------------------------------- You ask about the motivation for the concept of Gelfand-Kirillov dimension, developed in Exercises 5.2:2-9. It originates in ring theory rather than monoid theory; cf. discussion in the paragraph preceding Exercise 5.2:5. If one looks at easy examples such as polynomial algebras k[x], k[x,y], etc., one finds that the dimension of the space spanned by products of length \leq i of the generators grows linearly in i for the first, quadratically for the second, etc.; and for more complicated structures, commutative or noncommutative, one encounters similar patterns -- the dimension tends to grow either as i^d for some d, or exponentially in i (e.g., for a free associative algebra). GK(R) is defined so as to "capture" the number d such that R grows like i^d if there is one; it gives infinity if one has exponential growth. If one tries, one finds that one can construct noncommutative rings for which GK(R) is not an integer or infinity; but it still gives some real number.
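These growth rates are easy to tabulate. The following Python sketch (function names are mine, not from the text) counts the dimensions of the spaces spanned by products of length at most i of the generators, for polynomial algebras and for a free associative algebra.

```python
from math import comb

def dim_poly(n_vars, i):
    """Number of monomials of total degree <= i in n_vars commuting
    variables: the binomial coefficient C(i + n_vars, n_vars)."""
    return comb(i + n_vars, n_vars)

def dim_free(n_gens, i):
    """Number of words of length <= i in n_gens noncommuting generators."""
    return sum(n_gens ** j for j in range(i + 1))

for i in (10, 20, 40):
    print(i, dim_poly(1, i), dim_poly(2, i), dim_free(2, i))
# dim_poly(1, i) = i + 1 grows linearly; dim_poly(2, i) = (i+1)(i+2)/2
# quadratically; dim_free(2, i) = 2^(i+1) - 1 exponentially.  GK-dimension
# extracts the exponent d from growth like i^d.
```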
Exercise 5.2:8 shows that the same growth rates occur for algebras as for monoids, so in a text like this, which assumes little ring-theoretic background, it is convenient to devote most of the development to the monoid case. ---------------------------------------------------------------------- You note that the well-ordered sets (Definition 5.3.2) can be characterized as the totally ordered sets all of whose reverse well-ordered chains are finite, and ask whether there is a nice characterization of the totally ordered sets all of whose reverse well-ordered chains are countable, noting that this class includes the ordered set of real numbers. I don't know. I wonder whether they are those that can be embedded in a lexicographic product \alpha\times\mathbb{R}, where \alpha is an ordinal, and \mathbb{R} is the ordered set of real numbers? (Here one could replace \mathbb{R} by, say, the open or closed real unit interval (0,1) or [0,1], since the former is order-isomorphic to \mathbb{R}, while \mathbb{R} and [0,1] are mutually embeddable.) ---------------------------------------------------------------------- > ... You mention in the paragraph after Exercise 5.3:3, > that some arguments for uniqueness of a differential equation > solution use connectedness. ... The simplest case is the equation y' = 0. On the real line, the set of solutions is the set of constant functions, while on a domain like (0,1) \cup (2,3), the solutions will be constant on each connected component, but can have different values on the two components, giving a 2-dimensional space of solutions instead of a 1-dimensional space. From this, one gets similar behavior for equations y' = f(x) for any continuous f(x); and one gets analogous results for higher-order equations; though, depending on the nature of the equations, there may or may not be complications in the existence and uniqueness results other than those resulting from non-connectedness.
But differential equations are far from my field; so I'm definitely no source of expert knowledge on the subject, or on how the experts look at it. ---------------------------------------------------------------------- Regarding transfinite induction (a term not used in the notes), you ask whether this refers to induction, in the sense of Lemma 5.3.4, over some ordinal. Right. > Also, I would enjoy seeing an example of some proposition that can be > proven via general or transfinite induction, but not via standard > induction on N. Exercise 5.3:5 proves a standard result on symmetric polynomials by induction over a well-ordered set which is isomorphic to an ordinal > \omega, so it can be thought of as a transfinite induction. There will be more examples in section 5.5, where we will study ordinals. (The definitions of ordinal arithmetic in (5.5.7-9) and (5.5.10) are by transfinite recursion, and so one proves results about these by transfinite induction; it is also used in proving Lemma 5.5.12(ii).) In the next chapter, it is used in proving Lemma 6.2.1; and still later, in section 9.2. ---------------------------------------------------------------------- You ask about my statement at the end of the 4th paragraph before Corollary 5.3.6 that because of the axiom of regularity, we can make set-theoretic constructions recursively, and whether without that axiom, we have to use methods other than recursion. What I meant was, "by recursion with respect to the membership relation", since regularity shows that this has DCC. Without regularity, one can still do recursion with respect to any index set having DCC, and this is still an important tool in set theory.
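A toy Python analog (my own illustration, not from the text) of recursion with respect to a relation with DCC is the set-theoretic rank of hereditarily finite sets, computed by recursion on the membership relation.

```python
def rank(s):
    """Set-theoretic rank, defined by recursion on membership:
    rank(s) = sup { rank(t) + 1 : t in s }.  The recursion terminates
    because membership on hereditarily finite sets has DCC (here:
    a frozenset cannot contain itself, directly or indirectly)."""
    return max((rank(t) + 1 for t in s), default=0)

empty = frozenset()                       # the von Neumann natural 0 = {}
one   = frozenset({empty})                # 1 = {0}
two   = frozenset({empty, one})           # 2 = {0, 1}
print(rank(empty), rank(one), rank(two))  # 0 1 2
```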
---------------------------------------------------------------------- In connection with the statement (three paragraphs after display (5.4.1)) that there is no easy way to extend the definition of an ordered pair as the set \{\{X\},\{X,Y\}\} to a definition of ordered n-tuple for larger n, you ask why we can't just define the ordered n-tuple (x_1,...,x_n) recursively as the ordered pair ((x_1,...,x_{n-1}),x_n) when n > 2. Nice! I hadn't thought of that. However, there are still advantages to the approach described in the text. For one, it has the useful property that if an m-tuple (x_1,...,x_m) and an n-tuple (y_1,...,y_n) are equal, then m=n and x_i=y_i for all i; while the approach you suggested would have every n-tuple also being an m-tuple for every m with 2 \leq m \leq n. ---------------------------------------------------------------------- > How do we formalize the notion of a construction that may not > necessarily be a function within ZFC? ... We formalize it as a rule which describes, in every case, what the value should be. This is like the situation to which the Axiom of Replacement is applied. > If the set of functions F is fixed, it seems like we should be able > to treat r as a function from a subset of X x F ... But the assignment \alpha \mapsto \aleph_\alpha is not itself a function in our set theory. ---------------------------------------------------------------------- You ask about the significance of singular cardinals (Def. 5.5.18). When dealing with a cardinal \kappa, one can usually say that the union of a family of <\kappa sets, each of cardinality <\kappa, is itself of cardinality <\kappa. (For instance, taking \kappa = \aleph_0, this is the assertion that a finite union of finite sets is finite.) The singular cardinals \kappa are precisely the exceptions. Fortunately they are, as I mention, sparse, so if one wants to use that principle in the proof of a theorem, one can throw in the hypothesis "Let \kappa be a regular cardinal" and one hasn't lost much.
As an example, if you look at p.2 of my paper with Shelah at http://math.berkeley.edu/~gbergman/papers/Sym_Omega:2.ps , you will see, among the people we thank, Peter Biryukov. We don't say there what we thank him for. It was he who pointed out to us that the assertion we make in the paragraph at the bottom of p.3 was not true as we originally formulated it, without the condition of regularity. ---------------------------------------------------------------------- In connection with Theorem 5.6.2, you ask whether, since every set is in bijective correspondence with an ordinal, this can be described as a universal property of ordinals. Not in any way that I can see. The word "universal" has various uses in math; for instance, "\forall" and "\exists" are respectively called the "universal" and the "existential" quantifiers; so there are very likely some statements using the word "universal" that follow from the above fact about ordinals. But what are called "universal properties" involve existence of unique maps. ---------------------------------------------------------------------- In connection with the suggestion preceding Exercise 5.6:2, that if you haven't seen proofs by Zorn's Lemma before you might look at such proofs in standard graduate algebra texts, or ask your instructor for some elementary examples, you ask for such examples. Most basic graduate-level algebra texts have many such proofs, but it takes some work to find where they are. One text where one can find these by an online search is Hungerford's "Algebra". If you go to https://books.google.com/books?id=e-YlBQAAQBAJ and type "zorn" in the box saying "Search inside", you will get 23 results, and turning to the corresponding pages in a physical copy of the text, you can find the proofs in question. However, I suspect you can do *most* of Exercises 5.6:2-5.6:14 without looking at additional examples. 
It is unfortunate that I began that string of exercises with 5.6:2, which, in its present form, does not make it at all obvious how Zorn's Lemma would be used. So look at the next few, and see whether you have better luck. ---------------------------------------------------------------------- You ask about the intuitionists' rejection of the Law of the Excluded Middle, i.e., their position that if a statement has neither a proof nor a counterexample, it is neither true nor false (mentioned in section 5.7); and you ask whether there are in fact any such statements. I know of all this only from hearsay; I haven't studied the history. But I would think that, first of all, they would have objected to the application of the Law of the Excluded Middle to a statement for which no proof or counterexample was *known*: if we don't know a proof or counterexample, and can't prove that one of these exists, then they would insist that we can't say that the statement must be true or false. And there are plenty of things for which we don't know a proof or a counterexample. (As mentioned in a related context in Wednesday's class, I think that their attitude came out of "logical positivism", which says that a statement is only meaningful if one can describe a test that will determine whether it is true; which I think was a reaction against philosophers' use of vague terms without giving satisfactory definitions of them.) As for whether there in fact exist statements that can neither be proved nor disproved by a counterexample -- Goedel's Incompleteness Theorem shows, roughly, that given any sufficiently strong mathematical language and any set of precise rules for reasoning about the statements in such a language, there will be statements that can neither be proved nor disproved.
But typically, these statements are equivalent to assertions "For all integers n, a certain computation gives a certain result"; so if they can't be disproved, then in particular, they can't have counterexamples, so the proof of Goedel's result proves that they are true, even though that can't be proved using whatever precise rules of reasoning were assumed given. Whether similar results have been obtained that don't, in this way, show that a statement is "really" true or false, I don't know. But I certainly think it is plausible that there are mathematical statements whose truth or falsity can't be established in any way. ---------------------------------------------------------------------- Regarding section 5.7, you write, > ... You say anything proved within our systems may model > the real world ... What I meant is that if we set up a mathematical model of some aspect of the real world, say in terms of differential equations, and we ask a question about how that model behaves, and answer it with the help of the Axiom of Choice, then assuming the question is equivalent to the question of how some numerical computations come out (say computations that approximate the differential equation more and more closely using finite-difference equations), then what we deduce using the Axiom of Choice must be consistent with the results of our computations, and so must represent the behavior of the real world with as much accuracy as our model does. > My question is do you know any examples where it is "convenient" to > accept AC ... I think the statement that every vector space has a basis is such a result -- it allows us to picture "exactly" what all linear maps between two vector spaces look like. For instance, we can say that given vector spaces V and W, any homomorphism f from a subspace of V into W can be extended to a homomorphism from all of V into W.
The existence of such extending homomorphisms may not itself be a useful fact about the real world; but knowing that such a homomorphism f can always be extended shows us that there is no point in looking for restrictions that extendability implies on f; and it is useful to know what not to waste our time on. (In contrast, the statement that, say, an additive homomorphism of abelian groups f: 2\Z --> \Z can be extended to all of \Z does restrict f: such an f must have image consisting of even integers, though a general map 2\Z --> \Z need not.) > ... and also do you think it would be "inconvenient" to work > in a system either without AC or one with the negation of AC? Yes. ---------------------------------------------------------------------- You ask about the relation between the concept of lattice defined in section 6.1, and the use of that term that you were familiar with, for a discrete subgroup of R^n. I think that the concept you are referring to and the concept defined in section 6.1 are each named based on the way a picture representing the mathematical structure looks. The Oxford English Dictionary's first definition of lattice is "A structure made of laths, or of wood or metal crossed and fastened together, with open spaces left between". If you look at some of the pictures at https://en.wikipedia.org/wiki/Lattice_(order)#Examples , you'll see how these resemble such a structure. On the other hand, although a picture of the concept you described might not include line-segments, it is suggestive, in its repeating regularity, of the real-world lattices that people build; this is reflected in the OED's definition 4.a for "lattice": "Any regular arrangement of points or point-like entities that fills a space, area, or line; spec. a crystal lattice or a space lattice; ...".
(Russian has two competing words for the concept defined in section 6.1: "struktura", which simply means "structure", and so is ambiguous, and "reshotka", meaning "sieve", which has the above pictorial quality. I don't know what Russian uses for the concept you described.) ---------------------------------------------------------------------- You ask why I emphasize "pointwise" in the second paragraph after Definition 6.1.4. Hard to remember exactly what was in my mind when I put in that emphasis. I guess the idea was that in a given set S of functions, two functions f and g may have a least upper bound, i.e., a least member h of S that is everywhere \geq f and \geq g, without our being able to say much about this h; and that the reader might carelessly think that this is all that "the maximum of f and g" referred to if they missed the word "pointwise". I tend to be a sloppy reader in such ways, and assume my audience is also likely to be. And in general, when a concept that hasn't come up previously in what one is doing is introduced, it is useful to bring it to the reader's attention, and not let it get passed over unnoticed. ---------------------------------------------------------------------- You ask why, as mentioned in the 2nd paragraph before Exercise 6.1:2, some people write lattice operations using the symbols for addition and multiplication. I'm not sure, but I can make several guesses. I don't know when the symbols \vee and \wedge were introduced; there may have been a time when they were not common, and people simply tried to choose existing symbols with the closest meanings. "+" is natural for "putting things together"; moreover, for unions of subsets of a set, if we think of those subsets as represented by {0,1}-valued functions on the set, the union can be thought of as "addition as integers, with 1 made a ceiling"; while in most natural contexts, meets are intersections, which correspond to products of {0,1}-valued functions.
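The "addition with a ceiling" and "multiplication" descriptions are easy to verify directly; a minimal Python check (variable names are mine, not from the text):

```python
# Subsets of a 5-element set, represented by {0,1}-valued
# characteristic functions.
U = range(5)
A = {0, 1, 3}
B = {1, 2}

def chi(S):
    """Characteristic function of S as a list of 0s and 1s."""
    return [1 if x in S else 0 for x in U]

a, b = chi(A), chi(B)
join = [min(x + y, 1) for x, y in zip(a, b)]   # addition, with 1 as a ceiling
meet = [x * y for x, y in zip(a, b)]           # multiplication

print(join == chi(A | B))   # True: the capped sum is the union
print(meet == chi(A & B))   # True: the product is the intersection
```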
Even after \vee and \wedge had been introduced, some people may have simply stuck with the symbols they had learned first. Also, for a long time we didn't have computers on which to compose mathematics, and typewriters generally didn't have special symbols, but did have +, while "xy" didn't require any symbol. Finally, some people may use "arithmetic" symbols because they feel it valuable to stress the analogy between lattices and rings; one can speak of "ideals" in a lattice, for instance (subsets closed under internal joins and meets with arbitrary elements). Even though these don't play the role of determining the structure of the image of homomorphisms, as they do in rings, they have some uses. ---------------------------------------------------------------------- You ask why 0 and 1 are used for the least upper bound and the greatest lower bound of the empty family (as indicated in section 6.2) rather than "some other notation (possibly involving $\infty$)". In the lattice of subsets of a set, looked at as \{0,1\}-valued functions, the empty set is the constant function 0, and the total set is the constant function 1. Also, just as in rings, 0 and 1 are the neutral elements for + and ., so in lattices, a least and a greatest element will be neutral elements for \vee and \wedge. ---------------------------------------------------------------------- You ask, in connection with the symbols "0" and "1" for the least and greatest element in a lattice having these, noted in section 6.2, whether this is related to the fact that in a ring, 0 and 1 generate the smallest and largest ideals. The use of "0" and "1" is certainly related to the ring-theoretic and ideal-theoretic analogies; in particular, the case of Boolean rings; and that case is in turn related to the fact that in the set 2^X of subsets of a set X, identified with their characteristic functions, the empty set and the whole set correspond to the constant functions 0 and 1. 
---------------------------------------------------------------------- You ask why the fact noted following Exercise 6.2:5 that infinite meets and joins in a complete lattice are not operations of a fixed arity is a problem for complete lattices and not for <\alpha-complete lattices in general. For <\alpha-complete lattices, one can regard it as a problem, but a problem with an easy solution, noted in the middle of the paragraph in question: Regard such objects as having a set of "meet" and "join" operations, one for each arity <\alpha. The difference in the case of unrestricted complete lattices is that the resulting system of operations will not form a set, since there is not a set of all cardinals. Given a particular such complete lattice L, one can argue that one doesn't "need" operations of arities bigger than card(|L|). But in many situations one isn't _given_ L; rather, one wants to construct an L, or say whether there is an L with a particular property, so one can't know in advance a certain set of operations that will be sufficient. And this is exactly what goes wrong in the matter that paragraph refers us to, Exercise 8.10:6(iii). You also suggest regarding meet and join as unary operations on P(|L|). Well, the theory of sets with two unary operations would then be applicable to that structure; but that isn't the same as the theory of L as an algebra. ---------------------------------------------------------------------- You ask why, in the paragraph preceding Exercise 6.2:12, I attach importance to \omega^X being a full direct product. If we try to extend the result of the preceding exercise to non-complete lattices L, we find that we cannot in general map any lattice of the form P(X), i.e., 2^X, onto it by a complete upper semilattice homomorphism. But we could if we allowed ourselves to use for our domains complete upper subsemilattices of P(X).
(Just embed L in a complete lattice L', find a map f of some P(X) onto L' as in the preceding exercise, and then note that f^{-1}(L) is a complete upper subsemilattice of P(X) which f maps onto L.) So getting a surjective map on a general subsemilattice of a direct product is easier than getting such a map on a full direct product; and my comment notes that we are not taking that easy way out, here. ---------------------------------------------------------------------- You ask whether there is a characterization of "cocompact" elements (sentence after Exercise 6.2:15) in the lattice of subgroups of a group. Well, one context where such a concept comes up is in module theory. A nonzero module is called simple if it has no proper nonzero submodule, and the submodule of a module M generated by all its simple submodules is called the "socle" of M. One can show that the zero submodule of M is cocompact in the lattice of all submodules of M if and only if every submodule of M contains a simple submodule, and the socle of M is finitely generated. In particular, looking at Z-modules, i.e., abelian groups, it follows that the zero subgroup of an abelian group A is cocompact in the lattice of all subgroups if and only if every element of A has finite order, and A has only finitely many elements of prime order. I believe the above fact on modules over a general ring is somewhat useful in module theory. But I don't know a criterion for a general submodule N of a module M to be cocompact in the submodule lattice. One cannot say that this will hold if the zero submodule of M/N is cocompact in the lattice of submodules of M/N. For instance, let A be the abelian group of exponent p which is free as a Z/pZ-module on a basis x_0, x_1, ..., x_n, ... and B the submodule of A consisting of the elements in which the sum of the coefficients of the above basis elements is 0. Then A/B =~ Z/pZ, so the zero element in the submodule lattice of A/B is certainly cocompact.
But B is not cocompact in the submodule lattice of A. To see this, look at the submodules A = A_0 > A_1 > A_2 > ... where A_i is generated by all basis elements x_j with j\geq i. We see that none of the A_i is contained in B (each contains x_i, whose coefficient-sum is 1), but their intersection, the zero submodule, is, proving noncocompactness of B. I haven't thought about the corresponding questions for nonabelian groups. ---------------------------------------------------------------------- > In Definition 6.3.2 you say the "class of subsets". What does > this mean? It means "set of subsets". In general, "class" has a wider meaning than "set", as discussed in the next-to-last paragraph of section 5.4 (i.e., the next-to-last of the paragraphs preceding Exercise 5.4:1). But as noted in the paragraph after that one, it is also used in contexts where the only relevant classes are sets, so in these cases it means the same as "set". ---------------------------------------------------------------------- In connection with the concept of a ring with involution (mentioned in the paragraph before display (6.4.1)), you ask about the etymology of the word "involution". Never thought of that! I looked it up in the OED. It seems that "involution" is a noun from the root of "to involve", and most of the nonmathematical meanings that it gives have the idea of entanglement. It gives three mathematical meanings: An old one, which apparently meant raising a number to a power; a second one, which I was familiar with as a map of the plane into itself which in appropriate polar coordinates has the form (r,\theta) |-> (r^{-1},\theta) (though they only refer to functions of the line into itself), and finally "A function or transformation that is equal to its inverse." The OED is not big on explaining how meanings come from each other.
My guess is that "raising to a power" arose from the idea of a number being "entangled with itself", and is unrelated to the other two (though the geometric sense could be somehow related to r^{-1} being a power of r, or, if one reverses the sign of \theta, to taking inverses in the complex plane). I think that the (r,\theta) |-> (r^{-1},\theta) sense might have come from a biological sense that the OED gives, "A rolling, curling, or turning inwards" on the part of an organ. This fits with the etymology of "in+volu-" = "in-turning". Something that "turns inwards" often ends up turned inside out, and if "involution" came to have that meaning, it would easily fit (r,\theta) |-> (r^{-1},\theta). Then this could have been generalized to any function of order 2, giving their final sense; and then specialized within ring theory to an order-two map with the property I state. (By the way, the map (r,\theta) |-> (r^{-1},\theta) has an interesting geometric property: it takes \{lines and circles\} to itself. And on points with rational values of r, it preserves the property of having rational distances from each other (by an easy observation involving similar triangles). So it is a useful tool in studying families of points with rational distances among them.) ---------------------------------------------------------------------- You ask how the exchange axiom, (6.4.1), is used in showing that bases of a vector space all have the same number of elements. More precisely, that axiom is used when we know that one of the bases is finite. (An entirely different method is used when both are infinite, which calls on the fact that all vector-space operations are finitary.) The idea is as follows: Suppose B_1 and B_2 are bases of V, with B_1 finite. We do induction on the number of elements belonging to B_1 but not to B_2. If that number is 0, we are done, since one basis can't be properly contained in another. If not, let z\in B_1 not be a member of B_2, and let X = B_1 - \{z\}.
Since X does not span V, its span (closure) must miss some y\in B_2. Applying the exchange condition to this X, y and z, one can deduce that (B_1 - \{z\}) \cup \{y\} is again a basis of V; and it has the same number of elements as B_1, but more elements in common with B_2 than B_1 did, allowing us to complete the induction. Check out the linear algebra text where you first saw the uniqueness of dimensions proved, and see whether the argument there is "essentially" the above. ---------------------------------------------------------------------- You ask about the relationship between the concept of Galois connection in my notes (section 6.5), and the one at http://en.wikipedia.org/wiki/Galois_connection . Note that in Exercise 6.5:2, I give an equivalent description of a Galois connection, and in the second half of that exercise, I generalize it to partially ordered sets. This corresponds to the "Alternative Definition" in the Wikipedia article, which they call an "antitone Galois connection". Their definition of a "monotone Galois connection" is simply a Galois connection in that sense between a poset A and the opposite of a poset B. ---------------------------------------------------------------------- Regarding Lemma 6.5.1, you write > The first couple conditions given in this lemma look like theorems > of intuitionist logic regarding negation: > > A -> B <=> ~B -> ~A > A => ~ ~A > ~A <=> ~~~A > > but without the classic > ~~A => A > > which would be analogous to the lemma condition (ii) being > equality rather than inclusion. Is there a connection here? I don't know. I haven't studied intuitionist logic; but it sounds interesting. Maybe one has a Galois connection on propositions, given by "is incompatible with under intuitionist logic" ... ? ---------------------------------------------------------------------- > In example 6.5.6 what is a radical ideal? 
In a commutative ring R, the radical of an ideal I is the set of all elements r\in R such that some power r^n lies in I. This is itself an ideal. A "radical ideal" is an ideal that is its own radical; i.e., that contains r if it contains r^n for some positive integer n. In the context of 6.5.6, note that if a polynomial f has the property that f^n(a_1,...,a_n) = 0 for some (a_1,...,a_n)\in \C^n, then f(a_1,...,a_n) = 0 for the same (a_1,...,a_n). Using this observation it is not hard to see that for any subset A of \C^n, the set of polynomials A* is a radical ideal of the ring of polynomials. ---------------------------------------------------------------------- You ask about generalizations of the duality on convex sets that I describe in Example 6.5.7; in particular, in the case of polyhedra in R^3, mentioned in class. It looks to me as though it should be possible to generalize the duality to nonconvex polyhedra X whose faces don't contain the origin, 0. Namely, write the plane of each face F_i of X in the form f_i(x) = 1 where f_i is a linear functional, and take the vertices of the dual to be these points f_i. Let two vertices f_i, f_j be connected by an edge in the dual if the faces F_i, F_j meet at an edge in X, and let the dual have a face with vertices f_{i_1},...,f_{i_k} if the faces F_{i_1},...,F_{i_k} meet in a vertex in X. This duality wouldn't correspond in any way I can see to a Galois connection, but it looks fairly easy to work with. However, there would be complications: a non-convex polyhedron can have more than one face lying in the same plane, and this would lead to the dual polyhedron having vertices that have to be "counted more than once". So one would have to set up a theory of polyhedra with vertices possibly counted more than once, and if one thinks about it, the same phenomenon for edges, and probably faces.
One could also approach the construction more abstractly, using a formal description of a polyhedron in terms of abstract vertices, edges and faces, with an incidence relation. I don't know just what properties the incidence relation should be assumed to have, but the properties would probably be self-dual, and so allow dualization. You write that you tried something like that out for a torus, and it seemed to be self-dual. This is probably because the Euler characteristic, V - E + F, is unchanged under interchanging V and F. But this wouldn't work in higher dimensions; first, because in that case the Euler characteristic doesn't completely determine the structure of the manifold, and second, because in even dimensions, dualization would change the sign of Euler's formula. ---------------------------------------------------------------------- Regarding Example 6.5.8 you ask about my assertion that when X is a ring of abelian-group endomorphisms of M, so that M is an X-module, then X^* is the ring of X-module endomorphisms of M. Well, have you written out, on the one hand, the condition for t\in T to belong to X^*, and on the other hand, the condition for t\in T to be an X-module homomorphism, and compared them? If you did, but don't see why the resulting properties should be equivalent, then send me the conditions you have written down, and I'll say more. ---------------------------------------------------------------------- You ask about the assertion in the paragraph following Exercise 6.5:8 that the set of propositions implied by s \vee t is the intersection of the set of propositions implied by s and the set of propositions implied by t. Well, let me know how far you were able to get. There are two parts to such a statement of equality: that any proposition implied by s \vee t must be in that intersection, and that any proposition in that intersection must be implied by s \vee t. Can you prove either one of these statements?
(If you have trouble with one of the directions, you might ask yourself "What might examples of propositions s, t and another proposition p for which the desired implication doesn't hold look like?") When you've gotten as far as you can with this, let me know what you see and what you don't see, and I'll help. ---------------------------------------------------------------------- Regarding Galois connections (section 6.5) you ask under what conditions the closed sets of one of the resulting closure operators will be the closed sets of a topology. Since the class of closed sets under a closure operator is automatically closed under arbitrary intersections, the conditions that have to be satisfied are that the empty set be closed, and that the union of two closed sets be closed. (The latter is condition (b) of Exercise 6.3:15.) There are nice conditions one can assume on a relation R \subseteq S x T that will imply these properties. To make the empty set closed, one can assume that there is an element of T that relates to no elements of S. To make unions closed, one can assume that for any two elements t_1, t_2 \in T, there is an element t_1\vee t_2 \in T, such that the elements it relates to under R are precisely those to which either t_1 or t_2 relates. (Cf. first display after Exercise 6.5:8.) This may seem unnatural -- it does not hold in "typical" Galois connections -- but neither do "typical" Galois connections have the property that unions of closed subsets are closed. An example where it does hold, other than languages with an operator \vee and their models, is Example 6.5.6 (points of complex n-space, and the polynomials that are zero at those points). For any two polynomials t_1 and t_2, their product can be used as "t_1\vee t_2"; and the Zariski topology on complex n-space arises from this Galois connection. The sufficient conditions described above are not necessary.
For instance, to get the empty set to be closed, it suffices that for _each_ element of S there be an element of T which does not relate to it; and one can similarly weaken the condition that leads to finite unions. However, my guess is that the properties I've described will tend to give the most natural cases where the closed sets under the Galois connection form the closed sets of a topology. ---------------------------------------------------------------------- You ask, concerning the first sentence of Example 6.5.9, where I refer to "objects of this sort", whether this means sets with operations of a given signature; and also whether there are any restrictions on the language in which the propositions comprising T are expressed. My intent was to be very general. Rereading that Example, I think that in the first line, after "Let S be a set of mathematical objects", I ought to add, "of a given sort (e.g., of groups, of positive integers, of topological spaces, ...)". Then, hopefully, the sense of "of this sort" later in the sentence will be clear. The propositions can likewise be in any language -- all that is needed to get a Galois connection is that for each object s\in S and each proposition t\in T, it makes sense to say whether s satisfies t. The point of this Example is to show that the basic ideas of studying sets of mathematical objects determined by propositions that they satisfy, and sets of propositions determined by the mathematical objects that satisfy them, can be looked at as an example of the concept of Galois connection. Of course, when one wants to study such a situation further, one will generally want restrictions of one sort or another. For instance, the paragraph following Exercise 6.5:8 discusses certain properties that follow if the language in question includes the operators "or" and "and" (interpreted in the standard way), and T is closed under those operators.
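Any such relation between objects and propositions gives a Galois connection, and the two polarity maps and the resulting closure operator are easy to compute with directly. A minimal sketch, with a made-up finite relation R (the sets, relation, and function names here are all invented for illustration):

```python
# A Galois connection induced by a relation R between S and T, as in
# section 6.5: for X a subset of S, X* = {t in T : s R t for all s in X},
# and dually for subsets of T.  (Invented example, not from the text.)
S = {1, 2, 3}
T = {"a", "b", "c"}
R = {(1, "a"), (1, "b"), (2, "b"), (3, "c")}

def star_S(X):
    """X, a subset of S  |->  X*, a subset of T."""
    return frozenset(t for t in T if all((s, t) in R for s in X))

def star_T(Y):
    """Y, a subset of T  |->  Y*, a subset of S."""
    return frozenset(s for s in S if all((s, t) in R for t in Y))

def closure(X):
    """X**, a closure operator on subsets of S."""
    return star_T(star_S(X))

# Properties as in Lemma 6.5.1: X is contained in X**, and X* = X***.
for X in ({1}, {2}, {1, 2}, set()):
    assert set(X) <= closure(X)
    assert star_S(X) == star_S(closure(X))

# The closure can be strictly larger than X: here 2 relates only to "b",
# which 1 also relates to, so 1 lands in the closure of {2}.
assert closure({2}) == {1, 2}
```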
The material of Chapter 9 concerns the situation you asked about, where S consists of all sets with operations of a given signature. There, T consists of all identities in those operations (but does not contain expressions formed with "or". Whether we allow "and" (written "\wedge") doesn't really matter, because, e.g., if t_1, t_2, t_3 are identities, then the set of algebras satisfying all members of, say \{t_1\wedge t_2, t_3\} is also the set satisfying all members of \{t_1, t_2, t_3\}.) ---------------------------------------------------------------------- You ask about the relation between closed subsets, i.e., sets fixed under "**", where "*" is the operator in a Galois connection, and topological closure, such as comes up in Galois theory of infinite field extensions. Well, closure operations occur throughout mathematics, as the examples given in section 6.3 show. It happens that in one area, topology, one deals with a closure operation that is simply called "closure". This is perhaps what led you to look at the word "closed" in that way. I guess when a closure operator is not finitary, i.e., when the closure of the union of a chain of subsets can contain more elements than the union of the closures of the sets in the chain, then the way these new elements arise is often given by a topological closure. So when this happens, as in the field extension case you mentioned, topology will be involved. I don't know whether every closure operator on a set can be decomposed somehow into a finitary closure operator and a topology ... . But anyway, as Exercise 6.5:3 shows, every closure operator on a set can be looked at as arising from some Galois connection, but as Exercise 6.3:17(i) shows, those closure operators that come from topologies alone are highly restricted. ---------------------------------------------------------------------- You asked about my use of the phrase "(generally infinite)" in the fourth-from-last paragraph of chapter 6. 
The reason I put that phrase in was, of course, that we usually see the conjunction symbol used to connect two propositions, or, if several conjunction symbols appear, finitely many; and I wanted to make clear that the finiteness restriction that is automatic in such cases was not being assumed there. I could equally well have expressed this by writing "(possibly infinite)". In choosing to say "generally", I was implicitly assuming that in mathematics, infinite structures (such as the ring of integers, or the real numbers) are more often of interest, and finite structures a more special case. But for mathematicians who specialize in the study of finite objects (e.g., finite groups, finite lattices, etc.), the reverse is true. So there was no absolute justification for my choice of word. ---------------------------------------------------------------------- You write that you have heard that some mathematicians remove the existence of identity morphisms from the definition of category (Definition 7.1.2). I don't recall hearing that. Can you point me to an example? You say that this is as reasonable as treating semigroups and monoids along with groups. I'll agree that it is as reasonable as considering semigroups -- but not that it is as reasonable as treating monoids! The definition of "monoid" embodies the natural structure on the set of endomorphisms of a mathematical object. There are indeed cases where the definition of a semigroup describes a structure that one wants to deal with; but these come up only in more complicated situations: when one wants to look at endomorphisms of a mathematical object that satisfy some restriction which respects multiplication, but isn't satisfied by the identity map. E.g., "all maps of the infinite set X into itself that have finite image", "all non-one-to-one maps of X into itself", etc.. 
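The second of these semigroup examples can be checked concretely. The text's example uses an infinite X, but the same closure property already holds for a finite set; this small verification is an editor's illustration, not from the text:

```python
from itertools import product

# The non-one-to-one maps of X = {0, 1, 2} into itself are closed under
# composition, so they form a semigroup; but the identity map is not
# among them, so they do not form a monoid.
X = range(3)
all_maps = list(product(X, repeat=3))              # f[i] = image of i
non_injective = [f for f in all_maps if len(set(f)) < 3]

def compose(f, g):
    """(f o g)(i) = f(g(i)), with maps encoded as tuples of images."""
    return tuple(f[g[i]] for i in X)

identity = (0, 1, 2)
assert identity not in non_injective               # no identity element
for f in non_injective:
    for g in non_injective:
        assert compose(f, g) in non_injective      # closed under composition
```

(As the passage goes on to say, this semigroup is not "stand-alone": it sits inside the monoid of all 27 self-maps of X.)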
Generally, these are not "stand-alone" examples; rather, they occur as subsemigroups of monoids (in the above two cases, the monoid of all maps of X into itself). Why don't I similarly introduce "nonunital categories"? There are endless tangents one can go off on, and one has to limit what one covers, both for reasons of time, and to give a unified subject matter that the student can absorb. If it seems from the notes that I am inclined to go in all directions, this is illusory -- I give very varied examples of the major concepts (such as "universal constructions") so as to provide a full perspective for understanding them. But in the basic concepts that I am presenting, I try to stick to the important ones, and not throw in less important variants. After learning about categories in the standard sense, the student who has reason to study structures that are essentially subsystems of categories closed under composition but not containing all relevant identity morphisms can easily do so. ---------------------------------------------------------------------- Regarding constructions like G_{cat} for G a group, and P_{cat} for P a partially ordered set (in the 6 paragraphs preceding Exercise 7.2:1), you write > ... the emphasis on relating the structure of certain categories > to the structure of mathematical objects such as monoids, groups, > partial orders, etc., is much greater than in previous introductory > texts which I have read ... > > Is this primarily intended as a way of providing many "concrete" > examples ... or ... will it become a useful mathematical tool ... I would say that my primary motivation was neither of these: it was to show that categories are "the same sort of things" as groups, partially ordered sets, etc.. To which I will add that it is equally important to see categories as being "of a different sort", in that they can represent in one entity the vast array of all structures in a field of mathematics.
But the way categories are used makes the latter viewpoint clear, while the viewpoint of them as mathematical structures like groups, monoids, etc. often gets overlooked. It is worth having both complementary understandings. Secondarily -- yes, these constructions give a nice class of examples of categories; examples different from the sort that one usually sees. And finally, there are some uses for such examples. For instance, we will see that if G is a group, then an "action" of G on an object of a category C is equivalent to a functor G_cat -> C; so constructions like "the fixed-point subobject of the action of G on X" will be expressible as an instance of the "limit" of a functor. ---------------------------------------------------------------------- You ask about the meaning of "isomorphic" in Exercise 7.2:1. An isomorphism i: C --> D of categories, like an isomorphism between other mathematical objects, means a way of mapping the elements comprising C bijectively to the elements comprising D so that the structure is exactly preserved. In detail, this means a bijection i_{Ob} of object-sets, and for each pair of objects X, Y \in Ob(C), a bijection i_{X,Y}: C(X,Y) --> D(i_{Ob}(X), i_{Ob}(Y)), which respects composition (and identity morphisms, though that follows from the other conditions). ---------------------------------------------------------------------- You ask about the omission of arrows representing composite morphisms in diagrams of categories (e.g., in the paragraphs following Exercise 7.2:1). In general, diagrams that we draw show morphisms that are going to be discussed, and that are not merely composites of other morphisms shown. If we tried to draw all the morphisms in a category, the result would usually be far too complicated and confusing to the eye. Our pictures simply show the key things we need to focus attention on. We don't _always_ omit all morphisms that are composites of others that we show.
E.g., in the diagrams in Proposition 4.3.3 and its proof, we showed the diagonal arrows even though they are (after the fact) composites of the horizontal and vertical arrows. But this is because conceptually they were given before the vertical arrows, and the properties characterizing the vertical arrows required the diagonal arrows to state them. So we omit composite arrows when we can. My showing the diagonal arrow in the first display after Exercise 7.2:1 was exceptional -- based on the very introductory nature of this section. ---------------------------------------------------------------------- You ask about naturally occurring examples of composition of relations (2nd paragraph after display (7.2.1)) other than the case of functions. Well, a lot of the things that are loosely called functions but aren't really can be thought of as relations. In calculus texts one sees "the function 1/x from real numbers to real numbers", but it is not a function because it isn't everywhere defined; and in some contexts one talks about "multivalued functions", such as "+- sqrt x". The obvious way of "composing" these corresponds precisely to composition as relations. Phrases like "is a friend of a client of --" can be thought of as the composite of the relation "is a friend of" and "is a client of". But mostly, I would say that if and when one has a question about composition of relations, one can use the definition itself, and gain experience with the concept by applying it in trying to answer one's question. It isn't a major topic of this course, so there will be few such questions here. (I have a preprint which considers, among a number of other structures, the monoid of self-relations on a set, under the composition operation, about which there are some open questions; you can look at it at http://math.berkeley.edu/~gbergman/papers/embed.pdf .)
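Composition of relations is easy to experiment with directly, treating each relation as a set of (input, output) pairs. A small sketch (the sample relations and names are invented for illustration):

```python
def compose(r2, r1):
    """The composite relation r2 o r1:  x (r2 o r1) z  iff  there is
    some y with  x r1 y  and  y r2 z.  For functions viewed as sets of
    (input, output) pairs, this is ordinary composition."""
    return {(x, z) for (x, y1) in r1 for (y2, z) in r2 if y1 == y2}

# "is a friend of a client of --" as a composite of two relations:
friend_of = {("ann", "bob")}              # ann is a friend of bob
client_of = {("bob", "lawfirm")}          # bob is a client of lawfirm
assert compose(client_of, friend_of) == {("ann", "lawfirm")}

# The multivalued "+- sqrt" as a relation, composed with squaring:
square = {(-2, 4), (2, 4), (-1, 1), (1, 1)}
sqrt_rel = {(y, x) for (x, y) in square}  # 4 |-> +-2,  1 |-> +-1
assert compose(square, sqrt_rel) == {(4, 4), (1, 1)}
```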
---------------------------------------------------------------------- You ask about the term "partial operation" used two paragraphs above Exercise 7.2:2. A "partial function X --> Y" means a function from a subset of X to Y. E.g., in Math 1A, when one speaks of "the function 1/x" or "the function sqrt x", these are partial functions from the real line to the real line. A partial binary operation on a set X is a partial function X x X --> X. In particular, if X is the set of all germs of analytic functions at points of the complex plane, then one has a partial operation of composition, since one can sometimes compose the germ of a function f at a point z_1 with the germ of a function g at a point z_0, namely, if and only if g(z_0) = z_1. As stated, these are exactly the cases needed to make "GermAnal" a category. ---------------------------------------------------------------------- I'll somewhat arbitrarily put this question about "empty composites" with material related to section 7.3, though it was actually asked much later. > ... Is there a nice way to define the composite of a collection > (or sequence?) of functions such that the composite of the empty > collection is the identity map? ... Well, see what you think of this definition. Suppose we are given n+1 objects X_0,...,X_n in a category C, and for 0\leq i < n, a morphism f_i: X_i -> X_{i+1}. We want to give the simplest possible definition of their composite, a function X_0 --> X_n. So we will recursively define \prod_{i=m-1} ^{0} f_i: X_0 --> X_m. The recursive step will obviously be \prod_{i=m} ^{0} f_i = f_m (\prod_{i=m-1} ^{0} f_i). What should we take as the base step? The naive answer would be to make the base step the definition of the product with m=0 by (\prod_{i=0} ^{0} f_i) = f_0. But I would say that a more elegant solution is to define the empty subproduct of this chain of morphisms, \prod_{i=-1} ^{0} f_i, as id_{X_0}: X_0 -> X_0. 
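For ordinary functions, the recursive definition above can be written out directly; here is a small Python sketch (the function name is invented for the example):

```python
def composite(fs, x):
    """Apply the composite f_{n-1} o ... o f_1 o f_0 to x.
    The base step of the recursion makes the empty composite
    (fs == []) the identity map, as in the definition above."""
    result = x                  # base step: id_{X_0}
    for f in fs:                # recursive step: bring in the next f_m
        result = f(result)
    return result

assert composite([], 5) == 5                      # empty composite = identity
assert composite([lambda x: x + 1], 5) == 6       # f_0 alone
assert composite([lambda x: x + 1,
                  lambda x: 2 * x], 5) == 12      # f_1(f_0(5)) = 2*(5+1)
```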
That way, "f_0" gets introduced at the m=0 recursive step just as each other f_i gets introduced at the m=i step. How does that look? ---------------------------------------------------------------------- You ask how one can allow a member of Ar(C) to belong to more than one hom-set (one page into section 7.3), given that they are drawn as arrows with definite source and target. The fact that we draw them that way isn't part of the definition of a category! It is simply a convenient way that we picture morphisms. So it is our right to draw diagrams that way that one might question, not whether one can allow morphism-sets to overlap. To the question "How can we justify drawing diagrams with each arrow having a source and target, when a given element may lie in more than one morphism set?", I think the right answer is that the arrow f we draw from X to Y represents f "regarded as a member of C(X,Y)"; and if we want to formalize that concept, we can do it by saying that the arrow really represents the 3-tuple (X,Y,f). This is not really different in nature from such questions as how we can justify writing the composite of elements f and g of a group G as fg, given that the underlying set |G| admits many group operations, and the product will be different in one than in another. The answer to that one is that fg is our shorthand for \mu_G(f,g), and it is safe to use such shorthand when we are not explicitly dealing with more than one group-structure on the same set (or on groups with overlapping underlying sets). > ... Also, since the text will not require that hom-sets be > disjoint, what advantages will this give? ... So far as I am concerned, only the advantage of not alienating people who are used to the definition saying that a function f: X --> Y is a subset of X x Y. To such people, the ordinary systems of sets and maps that they are used to would not form categories if we used the more restrictive definitions.
Too many categorists don't care about that -- they take the attitude "We know the right way to do things", don't try to make them intelligible to the general mathematical community, and wonder why category theory is underappreciated! But once one is doing category theory, the things one is interested in from one point of view can always be translated into a variant language; so a student who has read this text should have no trouble adjusting to the axiomatics of a text that assumes hom-sets disjoint. ---------------------------------------------------------------------- In connection with the discussion on the last two pages of section 7.3 of attitudes about categories, you ask whether category theory "has any content", or is, as Wittgenstein said of logic, merely "a tautology". Well, it has been said that all of mathematics consists of tautologies. Insofar as that is true, it is true of category theory in particular. A tautology is a statement that is automatically true; and it is usually thought of as therefore being a statement that is obviously true (such as "X = X"). But a statement can be automatically true without being obvious; and I think the nontrivial results of mathematics fall into this category. So being tautologies does not keep them from being powerful and useful additions to human knowledge. ---------------------------------------------------------------------- You ask (in connection with the discussion at the end of section 7.3) whether category theory is essentially a language in which to say things about existing fields of mathematics, or is a field with nontrivial content. I say, ask yourself that question at the end of this course! ---------------------------------------------------------------------- You write that you feel that the Axiom of Universes (section 7.4) doesn't seem as believable as the other axioms, which you feel are clearly true. Well, I don't consider the axioms of set theory to be "true" or "false" (cf.
section 5.7); I would judge them in terms of whether or not they form a useful model for the way we think about collections of entities, which enables us to reason precisely about these. Regarding the Axiom of Universes, the first few paragraphs of section 7.4 give reasons for setting up a set theory in which universes exist: that if we start with a set theory that merely satisfies ZFC, we would like to be able to talk about the collection of "all sets"; but that won't be a set. If we set up assumptions that allow us to treat these classes just like sets, why not rename them sets and assume ZFC applies to these new things we are calling sets? And if we do this once, why not allow the process to be iterated indefinitely, and express this as the Axiom of Universes? We can never get away from the problem that *all* the things we are considering as sets will be a collection that is not a set; but we'll have a system where the damage that that fact does has been essentially eliminated. Finally, as I said in class (I don't know whether you were already away at that point), most of the Axioms of ZFC consist of weakenings of the Axiom of Abstraction (described about one page after the list of axioms of ZFC). The general rule seems to be that any weakening of that axiom is OK if it doesn't allow us to define a set S in a way that requires one to already "have" S available to consider in applying the criterion for membership in S. And the Axiom of Universes is OK by that standard: each universe is built up from sets constructed "before" it. So -- since the Axiom of Abstraction seems intuitively "almost true" -- it is reasonable to accept this instance of it as "true". > ... Is it 'reasonable' to say that 'small sets' in the 'usual > sense' correspond to elements of some such minimal universe? Well, I'd rather not make such a convention.
ZFC doesn't preclude the existence of universes, so if you make the above convention, then things that people just assuming ZFC called "sets" would not necessarily be "small sets in the usual sense" under your convention. Moreover, set theorists like to study "large cardinal axioms", e.g., the existence of a "measurable cardinal", and my understanding is that almost all (all?) large cardinal axioms imply the existence of some inaccessible cardinals, equivalently, of some universes (though not necessarily the Axiom of Universes). ---------------------------------------------------------------------- Regarding the development of the Axiom of Universes in section 7.4, under which a universe is a set, and every set is a member of a universe, you ask, "Don't these imply that every collection of sets is a set which is a contradiction?" No, nothing in the axioms implies that every collection of sets is a set. Under the axioms, for each set X there is a universe which contains X; but that universe will vary from one set X to another. So there is no assertion that one universe contains all sets. ---------------------------------------------------------------------- Regarding the discussion at the beginning of section 7.4, you ask > ... you redefine "set" to "small" and "large" set. Then, later, you > mention the lack of a "set" in ZFC that satisfies being a universe, > though the class of all sets would be. I am understanding this to be > in the old sense? The term "set" will always refer to a conventional > set, or will set be used to encompass large and small set? Actually, the third paragraph of section 7.4 was just meant to lead the reader to the ideas developed in what followed; so when I said "So let us change their names ...", I really meant "So suppose we changed their names ...". As of the next paragraph, we begin formally setting up what we really do. 
What we talked of loosely as "old sets" and "new sets" are now both sets within the set theory that we are discussing. We no longer need to consider large sets to be "things we used to call classes"; though we can still say that the members of U form a self-contained set theory, from within which the things not in U look like "classes that aren't sets". I hope this helps. Let me know if you still have difficulty with this. ---------------------------------------------------------------------- Regarding the idea of fixing some universe U, and considering those objects of a given sort (including categories) that lie in U, as described following Definition 7.4.1, you write: > ... I'm having trouble understanding why we should expect any > set of categories to be in this "standard" universe U. ... If U contains, say, some set X of groups, then it will also contain the category whose objects are the members of X, and whose morphisms are the group-homomorphisms among these. And if U contains a set Y of sets of groups, it will contain the set of categories whose members are the categories constructed as described above from the members of Y. Let me know whether you have any difficulty with proving these statements, and/or if you have difficulty seeing that any universe U will contain sets of groups, and sets of sets of groups. Intuitively, ZFC was set up to handle "everything that mathematicians ordinarily do", and it does this quite well. Forming objects constructed as ordered tuples, sets of mathematical objects, sets of homomorphisms among them, etc., are among these things; so your understanding of ZFC should include, at least in sketch form, an understanding of how these things are done. Since categories are defined as tuples with certain properties, ZFC can handle these equally well; and since every universe satisfies ZFC internally, these properties will hold in any universe.
The one Achilles' heel of ZFC is the impossibility of defining "the set of all --", where "--" is not restricted in terms of some given set. The Axiom of Universes partly overcomes this: it allows us to speak of "the (large) set of all (small) --"; and that's what we must do when we define things like "the category of all groups". But if you merely want to get some, and indeed, lots of categories within U, that's easy: Just start with some set of groups etc. within U and, as described in the preceding paragraph, form the corresponding category. There are other sorts of categories that arise in ways different from "mathematical objects of a given sort and morphisms among them", as discussed in the paragraphs before Exercise 7.2:1. These are easy to construct in any universe as well. ---------------------------------------------------------------------- In connection with Definition 7.4.4, you asked whether the concept of an object such as a group G being U-small referred to the whole structure, e.g., the group (|G|, \mu_G, \iota_G, e_G), being a small set (a member of U), or just the set |G|. Those two conditions are equivalent, since once |G| is U-small, the map \mu_G, as a map |G| x |G| --> |G|, and hence a subset of (|G| x |G|) x |G|, will also be U-small, and likewise for \iota_G and e_G; hence the 4-tuple (|G|, \mu_G, \iota_G, e_G) will be U-small. Anyway, the intended meaning (in case there are cases where the equivalence is not evident) is that the whole structure (|G|, \mu_G, \iota_G, e_G) is a U-small set. ---------------------------------------------------------------------- Regarding the notion of a large group, mentioned in Exercise 7.4:1(ii), you ask "Do examples of this naturally appear? Are there interesting theorems about these or are they essentially the same as small groups?" Remember that a group that is "large" with respect to one universe will be "small" with respect to another universe.
Generally, one will study a given group within a universe to which it belongs, and there it will always be "small". Anyway, it's easy to get groups of arbitrarily large cardinalities; e.g., free groups on big sets; and a group of large cardinality won't lie, even up to isomorphism, in a universe of smaller cardinality. Perhaps your question really means "Are there interesting results that hold for groups lying in some universes that don't hold for groups lying in others?" Well, set-theorists are interested in "large-cardinal axioms", and if we consider a universe determined by a cardinal of the sort that one of those axioms concerns, some set-theoretic statements will be true that are not true in a sub-universe not satisfying the same large-cardinal axiom. Probably there are ways of encoding some of those set-theoretic statements in terms of the existence of groups with specified properties. But I'm not familiar with the field, and can't say whether such results would be of group-theoretic interest. ---------------------------------------------------------------------- Regarding my comment near the end of section 7.4 that if the Axiom of Universes should not prove adequate for future needs, one might assume a "Second Axiom of Universes", you ask "Can we make a Third, Fourth, or nth Axiom of Universes?" Certainly; but I don't see the point. The Axiom of Universes is simple, and does what we wanted and more. There's no reason to assume that the next challenge to the adequacy of set theory, if there is one, will come from the same direction; so rather than barricading ourselves against danger from that direction, we should just be on the lookout for what may come. (If it does come from that direction, we could just add such axioms.) > Another way of saying this would be: how long of a chain of > universes (ordered by "\in or =") can we get? ... 
Well, as you note, using the Axiom of Universes, one can get chains of universes as large as any ordinal (in any of those universes)! > ... the following reasonable-sounding axiom: "given any collection > of sets, there is a universe containing all of them," ... That won't work: "all sets" is a "collection" of sets, but since a universe is itself a set, we can't have a universe containing them all. ---------------------------------------------------------------------- Regarding the comment just before Definition 7.5.4 that "faithful" and "full" aren't the only analogs of "one-to-one" and "onto" that can be considered for functors, you ask what some of the others are. There are none that come up often enough that I knew their names; but looking online, I see that a functor F: C --> D is called "representative" or "dense" if for every object Y of D there is an object X of C such that F(X) is isomorphic in D to Y; a kind of "onto-ness" property. One could consider the "one-one-ness" property of taking non-isomorphic objects to non-isomorphic objects, but I haven't found anyplace where that is given a name. There are lots of adjectives used to describe kinds of functors; but most of the properties in question are of different sorts from one-one-ness and onto-ness. ---------------------------------------------------------------------- Regarding Definition 7.5.7 you ask > ... is it meaningful to speak of a hom-functor for a non-legitimate > category? In this case, the C(X,Y) are not sets. Yes they are!! I think you were in class Monday, when I emphasized that "large sets" are still sets within our version of ZFC -- they just aren't members of whatever universe U (itself a set) we happen to be focussing on; but they do belong to some larger universe U' by the Axiom of Universes. As noted in the second paragraph after Exercise 7.4:6, our version of set theory still has the property that "all sets" don't form a set. But "large sets" that we talk about definitely are sets. 
---------------------------------------------------------------------- You comment that antihomomorphisms of groups should have a similar role in group theory to contravariant functors (Definition 7.6.1) in category theory, and you ask why antihomomorphisms of groups are rarely talked about. Every group has a canonical antiautomorphism, the map x |-> x^{-1}; so to give an antihomomorphism a: G --> H is equivalent to giving a homomorphism f: G --> H defined by f(x) = a(x)^{-1}. So anything one might want to express in terms of antihomomorphisms can be expressed in terms of homomorphisms. Monoids and rings don't have such canonical antiautomorphisms, so one does, occasionally, look at antihomomorphisms among them. In particular, one often talks about involutions of rings (see 2nd half of paragraph before (6.4.1)). Aside from this, though, on those occasions when antihomomorphisms of rings R --> S come up, one most often writes them as homomorphisms R^{op} --> S; I guess because, as things that don't come up often, it is more comfortable to describe them in terms of things one uses regularly. On the other hand, contravariant functors are very common in mathematics, so one refers to them as such. ---------------------------------------------------------------------- You ask whether the Galois correspondence (I guess you mean the correspondence between subgroups of a Galois group and intermediate fields) is a functor (section 7.5). Well, it can certainly be made a functor by regarding the subgroups of the Galois group as forming a category with inclusions as morphisms, and similarly for the intermediate fields: they give anti-isomorphic partially ordered sets P and Q, and the anti-isomorphism translates to a contravariant functor (section 7.6) P_cat --> Q_cat, equivalently a covariant functor P_cat^op --> Q_cat. There is a more sophisticated category-theoretic approach to Galois theory which you might find more interesting. I don't know the details, but Lenstra used it in teaching Math 250A one year.
A lot of the students found it very difficult, and since I taught 250B the following semester, I ended up having to re-teach them Galois theory the traditional way. But Lenstra's notes from his 250A are online at http://websites.math.leidenuniv.nl/algebra/topics.pdf , and you might want to look at them. I think the Galois theory itself begins around the bottom of p.114, though it depends on lots of ring- and category-theoretic preparation in the preceding sections. ---------------------------------------------------------------------- Regarding the concept of a faithful functor (Definition 7.5.4), you ask > ... if we have a faithful functor from X to Y and a faithful functor > from Y to X, (and, say, both are injective on the object-sets) > are they isomorphic? Nope. There are counterexamples with 1-object categories X and Y, where the monoids of endomorphisms of the unique objects are in both cases abelian groups. (Not finite abelian groups, of course.) Can you find such an example? ---------------------------------------------------------------------- You note that for the definition of a product category C = \prod C_i in Definition 7.6.4, the Axiom of Choice guarantees that Ob(C), as defined, will have at least one element, but you ask whether it need have any others, and how we can be sure that it is "what we want it to be". Well, by its definition, a product set is always "what we want it to be"; the function of the Axiom of Choice is to guarantee that it will have the properties we expect it to. That axiom tells us here that if the categories C_i all have nonempty object-sets, then so will C. If they each have just one object, then of course Ob(C) = \prod Ob(C_i) will also just have one. 
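For finitely many finite factors, incidentally, this bookkeeping can even be checked mechanically, with no appeal to Choice; here is a throwaway Python illustration (my own, not from the text):

```python
# A toy check (mine, not the text's): for finitely many finite nonempty
# object-sets, Ob(C) = prod Ob(C_i) is nonempty, and has exactly one
# element precisely when every factor does.  (The Axiom of Choice is
# only needed when the index set is infinite.)
from itertools import product

obs_one_each = [{'X'}, {'Y'}, {'Z'}]         # each C_i with one object
assert len(list(product(*obs_one_each))) == 1

obs_bigger = [{'X'}, {'Y1', 'Y2'}, {'Z'}]    # one factor with two objects
assert len(list(product(*obs_bigger))) == 2  # ... so the product has two
```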
You should be able to prove from ZFC that if a family of nonempty sets does not consist wholly of 1-element sets, then their product has more than one element; as well as such statements as that if the family is infinite, and each member has more than one element, then the product set has at least 2^{\aleph_0} elements. ---------------------------------------------------------------------- > ... you mention that in categories there is no real way to > distinguish isomorphic objects. Is there any danger, then, of > always looking at the skeleton category, which identifies > isomorphic objects? ... From an abstract point of view, no. But in specific cases, it could well be confusing. For instance, suppose we are looking at the category Set, and are considering the ordinals as objects of that category. Then the countably infinite ordinals (of which we know there are uncountably many) all become isomorphic. Suppose we go to a skeleton category of Set in which the only representative of that isomorphism class is \omega. We could still think about the chain of inclusions of countable ordinals by taking a retraction of Set to a skeleton, and looking at the images of the inclusion maps \alpha --> \beta among the countably infinite ordinals as certain endomorphisms of the object \omega to which they have all been retracted; and we could, for instance, describe the first uncountable ordinal as the direct limit of that chain of endomorphisms of \omega. But it would be much easier to think of the category Set in which the countable ordinals are separate objects. ---------------------------------------------------------------------- Regarding Exercise 7.7:2, you ask why this doesn't show that the category theoretic notion of an isomorphism between objects deviates from the normal mathematical notion, since in concretization T, T(f) is a bijection, but in the others it is not. 
I think you're assuming that the "normal mathematical notion" of an isomorphism is a homomorphism that is bijective on the underlying set. But that works only for those sorts of objects where such maps have the property that their set-theoretic inverses are also morphisms. For a case where that is not so, see Exercise 5.1:1. You'll see there that the concept of isomorphism agrees with the category-theoretic version, not with that of being a bijection on underlying sets. ---------------------------------------------------------------------- Regarding the concept of epimorphism (Definition 7.7.2), you ask, > ... Is there an accessible paper which gives precise criteria for > epimorphisms in the category of rings? Yes. See John Isbell's series of papers, Epimorphisms and dominions, Proc. Conf. Categorical Algebra (La Jolla, Calif., 1965), pp. 232-246, Springer, 1966 Epimorphisms and dominions, II, J. Algebra 6 (1967), 7-21 Epimorphisms and dominions, III, Amer. J. Math. 90 (1968), 1025-1030 Epimorphisms and dominions, IV, J. London Math. Soc. (2) 1 (1969), 265-273 Epimorphisms and dominions, V, Algebra Universalis 3 (1973), 318-320 But (as Isbell was always concerned with pointing out), the statement of the criterion, his "Zigzag Lemma", in the first of the above papers is wrong; it is corrected in paper IV. > Also, are the problems in characterizing epimorphisms for rings > similar to those for monoids? Yes. ---------------------------------------------------------------------- Concerning the statement following Definition 7.8.1 that the empty set is the initial object of Set, you ask "What is the morphism in C(\emptyset, X)?" We touched on this in reading #1: see the second paragraph of section 2.4, with the key words, "there is exactly one". I didn't go into details there, because I felt that the student who thought this through would see it.
You should look at the definition of a function from a set X to a set Y, and ask yourself what satisfies that definition when X is the empty set. If you have trouble thinking this through, ask again. ---------------------------------------------------------------------- Regarding the concept of a free object with respect to a concretization of a category (Definition 7.8.3), you ask > Is there a generalization of free object for non-concrete categories? Well, a free group (etc.) is a group F(X) with a universal X-tuple of members of its _underlying_set_; so having an "underlying set" is part of the essence of the concept, and the category-theoretic abstraction of the underlying set is a concretization. But one can get various sorts of generalizations depending on how far afield one is willing to go, and still consider the result a version of the "free object" concept. For a small generalization, one can drop the faithfulness condition in the definition of concretization. E.g., the functor taking every group to the set of its elements of exponent 2 is not faithful, but there is an analog of the free group for that functor, namely the functor taking every X to the group presented by an X-tuple of generators, together with relations saying that all those generators (but not necessarily all other elements) satisfy x^2 = e. Much more loosely, one could call the result of any universal construction (or at least, any left-universal construction) a "free" object for the relevant conditions. And some authors do. In between these, one can consider the construction of the left adjoint of a functor, which we will see defined in section 8.3 to be (when it exists) a generalization of a free object construction. ---------------------------------------------------------------------- You ask about (co)products that are preserved under all functors (possibly assuming that the codomain category has coproducts). Well, this will certainly be true of the coproduct of any 1-object family! 
My guess is that it will not be true in any other cases; but I don't know. ---------------------------------------------------------------------- Regarding the concept of kernel in the 2nd paragraph before Definition 7.8.7, you ask > How closely does the categorical definition of kernel match our usual > meaning? I'm not aware of any categories with zero objects in which > the categorical definition differs from the standard definition, ... My first reaction was that one would have to come up with a pretty exotic category, and there would not be likely to be a "standard definition" of kernel there! However, your suggestion > ... but perhaps one could be concocted by taking some subcategory > of Ab in which not every standard kernel is an object ... works: if C is the category of divisible groups (Exercise 7.7:5), then the map Q --> Q/Z has zero kernel in C under our definition, but one would ordinarily say that it has kernel Z, which is not in C. ---------------------------------------------------------------------- In connection with the concepts of pushouts and pullbacks (Definition 7.8.7) you mention having seen pullbacks in algebraic geometry, and ask whether pushouts of schemes are equally important. Well, if you look at affine schemes, pushouts correspond to pullbacks of rings. In particular, given two subrings of a ring, the pullback of the diagram formed from that ring and those two subrings is the intersection of those subrings. But intersection does not respect the properties that algebraic geometers like, such as being Noetherian. (Can you find an example of a finitely generated Noetherian ring and two finitely generated subrings whose intersection is not Noetherian?) Intuitively, forming a pushout of schemes corresponds to gluing two schemes together in a manner prescribed by maps from a third; and this gluing process can create singularities.
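In case the pullback construction invoked here is unfamiliar in concrete terms: on underlying sets, the pullback of maps f: A --> C and g: B --> C is the fiber product {(a,b) : f(a) = g(b)}, and when f and g are inclusions of subsets this is a copy of the intersection, the set-level analog of the remark about two subrings. A small Python sketch (my own illustrative example, not from the text):

```python
# A toy illustration (mine, not the text's): the pullback of two set
# maps f: A -> C and g: B -> C is the fiber product
#   {(a, b) : f(a) = g(b)}.
def pullback(A, B, f, g):
    return {(a, b) for a in A for b in B if f(a) == g(b)}

# When A and B are subsets of C and f, g their inclusion maps, the
# pullback is a copy of the intersection of A and B -- the set-level
# analog of the remark that the pullback of two subrings is their
# intersection.
A, B = {1, 2, 3}, {2, 3, 4}
P = pullback(A, B, lambda a: a, lambda b: b)
assert P == {(2, 2), (3, 3)}   # i.e., a copy of the intersection {2, 3}
```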
---------------------------------------------------------------------- You ask what I mean in the first sentence of Lemma 7.9.4 by unordered pair of objects. "Ordered pair" is the standard term for the sort of entity that we write (x,y). If I said that there was no more than one morphism between an ordered pair (X,Y) of objects, this might be taken to mean that C(X,Y) had at most one element; but I want to say more: that C(X,Y) and C(Y,X) each have at most one element, and that they can't both have an element unless X=Y. So I use the phrase "unordered pair of objects" to mean "two objects, with no difference in the roles we assign them." I used the same phrase in Exercise 7.2:1, where I made a precise statement, then used this phrase as an informal translation. ---------------------------------------------------------------------- > What is the point of not assuming disjoint Hom-sets in the definition > of a category? As I say in section 7.3, I don't make that assumption "largely because it would increase the gap between our category theory and ordinary mathematical usage", since under the conventional definition, where a map f: X --> Y is a subset of X\times Y, f doesn't uniquely determine Y. Actually, in a many-sorted algebra A with family of underlying sets (|A|_i)_{i\in I}, one generally doesn't assume the |A|_i are disjoint; otherwise every time one constructed such an algebra, one would have to check whether one's construction accidentally produced elements belonging to more than one |A|_i. So in studying such algebras, one doesn't want to put in such a requirement. (E.g., in studying actions of groups on sets, one might want to consider pairs G, S where G is a group and S a G-set; these would have two underlying sets |G| and |S|. But sometimes one wants to consider the action of a group on itself.) And a category can be looked at as a many-sorted algebra.
So one might say that the choice is between optimizing things for the person who, given that C is a category, wants to say things about it, and for the person who, given some mathematical situation, wants to say "such-and-such is a category". For the former, it would be best that hom-sets be disjoint; for the latter, that this not be required. ---------------------------------------------------------------------- You ask how one can prove the statement in the paragraph following Exercise 7.9:9 that "there is no natural way to make a contravariant functor out of P_f." As I use it, the word "natural" is an informal term, like "obvious" or "reasonable", so it doesn't require a proof. The sentence simply means that none of the ways that we discovered, when we discussed the power-set construction, to turn maps among sets into maps among their power sets, gives a contravariant functor that takes finite subsets to finite subsets. However, you might try investigating whether there is or is not any way to make the construction associating to each set the set of its finite subsets into a contravariant functor, and if you can answer the question either way, hand it in as a homework problem. ---------------------------------------------------------------------- Regarding the definition of equivalence of categories (Definition 7.9.5) you ask: > ... If F is covariant must G be too? And if F is contravariant > G also? Right. More precisely, by the last sentence of Definition 7.6.1, "functor" means "covariant functor" if the contrary is not stated. So the definition of equivalence should be interpreted with both F and G being covariant functors. Then a "contravariant equivalence" between categories C and D is an equivalence (in that sense) between C^op and D. > Is there an example in which FG=Id_D but GF is only isomorphic > but not equal to Id_C? Yes. Let C be a category, and D any skeleton of C.
Let F be the functor determined as follows: For each object X of C let F(X) be the unique object of D isomorphic to X in C, and choose an isomorphism f_X: X -> F(X), using the identity isomorphism whenever X \in Ob(D). For h: X -> Y, define F(h) = f_Y h (f_X)^{-1}. (Draw the diagram to see how this works.) Let G be the inclusion functor of D into C. Then you'll see that FG=Id_D but GF is only isomorphic to Id_C. ---------------------------------------------------------------------- > Is the last condition in Lemma 7.9.6 what some people call > essentially surjective? Yes. I hadn't encountered the term, but doing a Google Book Search, I see that it is used that way. ---------------------------------------------------------------------- Concerning the statement after Definition 7.9.7, that the Axiom of Choice allows us to construct a skeleton for every category, you ask how we can do this for categories that are not small. It sounds as though you are still thinking in terms of the paragraphs of motivation at the beginning of section 7.4, which suggested that "small" and "large" sets might be used as new names for what had been called sets and classes. But what we moved on to in that section was a set theory in which "small sets" were sets in a given universe, and "large sets" were any sets within our set theory; and in which ZFC, and so in particular, the Axiom of Choice, applied to the set theory as a whole, not just to small sets. (We still have "proper classes", subclasses of the class of all sets, and we can't apply the Axiom of Choice or our other axioms to these. But a "large category" by definition has a _set_ of objects; it belongs to our set theory; it is merely not required to belong to the distinguished universe U within that set theory.) 
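The explicit skeleton-functor construction a couple of answers back (a representative F(X) of each isomorphism class, a chosen isomorphism f_X: X -> F(X) which is the identity when X is already in the skeleton, and F(h) = f_Y h (f_X)^{-1} on morphisms) can be played with concretely. A hedged Python toy, taking C to be finite sets and functions (encoded as dicts) with {0,...,n-1} as the chosen representative of each isomorphism class; all names here are my own, not the text's:

```python
# Toy model: C = finite sets with functions (as dicts), D = the skeleton
# whose objects are the sets {0,...,n-1}, one for each cardinality.
def rep(X):                 # F on objects: the representative of X's class
    return frozenset(range(len(X)))

def iso(X):                 # the chosen isomorphism f_X : X -> rep(X)
    return {x: i for i, x in enumerate(sorted(X))}

def F(h, X, Y):             # F on morphisms: F(h) = f_Y . h . f_X^{-1}
    f_X_inv = {i: x for x, i in iso(X).items()}
    return {i: iso(Y)[h[f_X_inv[i]]] for i in range(len(X))}

# For X already of the form {0,...,n-1}, iso(X) is the identity map; so
# F fixes the skeleton D pointwise, and FG = Id_D for G the inclusion:
Z, W, k = frozenset(range(2)), frozenset(range(3)), {0: 2, 1: 0}
assert rep(Z) == Z and F(k, Z, W) == k

# But GF is only *isomorphic* to Id_C: rep(X) usually differs from X,
# though f_X = iso(X) is always a bijection X -> rep(X).
X = frozenset({'a', 'b'})
assert rep(X) != X and sorted(iso(X).values()) == [0, 1]
```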
---------------------------------------------------------------------- You write that the most intuitive definition of equivalence of two categories is that they have isomorphic skeletons (Lemma 7.9.8), and ask why I didn't present this one first. Well, the definition I gave tends to be the one that comes up more in the motivating situations -- we have a way of constructing an object of D from any object of C, and vice versa; and while the composites of those constructions are not quite the identity functors of C and D, there are obvious isomorphisms of them with those identity functors. On the other hand, the skeleton of a category, while formally convenient, can be intuitively rather far from the original category -- e.g., in a skeleton of Group, we can't look at the various infinite cyclic subgroups of an infinite cyclic group Z as distinct groups -- they're "the same" group Z, mapped into itself by different morphisms. Anyway, as I've said before, it's good to have different ways of understanding the same mathematical concept. Which one gets introduced first is often a lesser matter. ---------------------------------------------------------------------- In connection with section 7.11, you write: > Another example of an enriched category: the hom-sets in the category > of sets with relations as morphisms have a boolean algebra structure. That's an interesting observation. But to make it an enriched category structure, one would have to figure out how composition of morphisms behaves on the given pair of Boolean algebras. It looks to me as though it will respect joins, but not meets or complements; so maybe it has to be weakened to an upper semilattice structure. ---------------------------------------------------------------------- Regarding the final paragraphs of section 7.11, you write: > ... you allude to a nontrivial morphism between morphisms between > morphisms in the category CatCat. ...
Not to a nontrivial morphism, but to a nontrivial *concept* of morphism; i.e., a way of defining "morphism between morphisms between morphisms" that doesn't just reduce to something one can define in any category or Cat-category. Recall that in defining the concept of "morphism of functors", we used the fact that functors have objects as their outputs, and that there can be morphisms from the objects produced by one functor to the objects produced by another. Now if we have two morphisms P and Q between a pair of functors F and G between Cat-categories C and D, the morphisms that comprise P and Q may in turn have morphisms between them (because D is a Cat-category), and if some choice of these morphisms gives the proper commuting diagrams, then we consider the resulting family of morphisms to be a morphism m from P to Q, i.e., a morphism m of morphisms P, Q of morphisms F, G between objects C and D of CatCat. ---------------------------------------------------------------------- I'll count this question as belonging to section 7.11: > ... Is there a way to regard a ring as a one-object category, much > as there is a way to consider monoids and groups? Yes. Rings are 1-object Ab-categories. (Recall that an Ab-category is a category whose morphism-sets all have structures of abelian groups, such that composition is bilinear.) ---------------------------------------------------------------------- > As is mentioned in Section 8.1, all universal objects are initial > objects in some category. But to show this, don't we need some > abstract way of constructing a category in which a universal object > is initial? The examples given in 8.1 seem rather ad hoc, as they > depend on the specific universal object being considered. Well, we don't yet have a general definition of "universal object"!
In Chapter 4 we displayed a large bunch of interesting constructions which seemed to have a feature of "universality" in common; and in section 7.8 we found that some families of these could be gathered under common category-theoretic descriptions. In the first half of Chapter 8 we will look at these more systematically. From the point of view of section 8.1, "universal object" might be defined as "structure corresponding to an initial object in some category"; in later sections, we look at specific sorts of properties that can be described in this way, but that have more natural descriptions in terms of other categories, and how to pass between the two sorts of description. ---------------------------------------------------------------------- You ask about my assertion in the paragraph preceding Exercise 8.2:2 that the functor U^\omega is represented by the free commutative ring on an \omega-tuple of generators. For any ring R, U^\omega(R) is the set of \omega-tuples of elements of R (see second sentence of Definition 7.8.5), and the ring with a universal \omega-tuple of elements is the free ring on an \omega-tuple of generators. (See discussion in the second paragraph of section 8.1 of the free group on 3 generators as the initial object in the category of groups with specified 3-tuples of elements. As discussed at the beginning of section 8.2, this can be translated as saying that that group is a representing object for the functor associating to each group G the set of 3-tuples of elements of G.) ---------------------------------------------------------------------- Regarding the statement you read somewhere that Cayley's Theorem is a case of the Yoneda Lemma (Theorem 8.2.4), you write > ... Since the one object R of G_cat is just an abstract construction > I don't really understand what h_R means in this context.
You can call that object "an abstract construction", but the definition of h_R still applies to it -- go to that definition, take R to be that one "abstract" object of G_cat, and see what object of what category h_R takes R to. Then remember that the concept of functor involves both objects and morphisms. So now check what Yoneda's Lemma says about morphisms in this case. ---------------------------------------------------------------------- You ask about the reason I reserve the word "free" for the construction of the left adjoint of an underlying-set functor (e.g., paragraph before Exercise 8.3:3), while other books you have read use it for more general left universal constructions. I can't be sure about the books you are referring to without knowing which they are and looking them over, but I suspect that they do not develop or assume known to their readers the general concepts of universal constructions, representable functors, and adjoint functors. I suspect that if they did, they too would use those terms in many places where they now use "free". Of course, since the word "free" is short and suggestive, there still might be a temptation to use it in place of the more technical-sounding terms. But despite this, I think that when one has a more precise language available, one will use it. It's no fault of those authors -- they are writing to an audience not familiar with the concepts of this course. ---------------------------------------------------------------------- You ask whether there is a connection between units and counits of adjunctions (Definition 8.3.9), and units and counits of algebras and coalgebras. Well, given an adjunction, if we write UF = T, then the unit gives us a morphism I --> T, and from the counit we can get a morphism T^2 --> T. 
In general, a functor T from a category into itself given with morphisms I --> T, T^2 --> T satisfying certain conditions is called a "monad" by some, a "triple" by others; and it is thought of as analogous to a monoid M, looked at as a set |M| with maps |M|^0 --> |M|, |M|^2 --> |M| satisfying similar identities. Since the map |M|^0 --> |M| is called the unit of M, the corresponding operation of a monad is called the unit of the monad. A dual sort of structure, called a "comonad", has a "counit". Adjunctions, being very symmetric, have both. The above connects the unit of an adjunction with the unit of a monoid. If one abstracts the definition of a monoid in terms of the category Set, then applying the same concept to the enriched category Ab, one gets the concept of a ring; or, using instead of Ab the category of k-modules, the concept of k-algebra, which you asked about. Dualizing, one gets the concept of coalgebra. I'm not sure whether one can actually bring the concepts of monoids and monads under one hat; I suspect so, namely that if M is a monoid, then the functor |M| \times -- : Set --> Set becomes a monad. Likewise for rings, replacing \times with \otimes: Ab --> Ab. ---------------------------------------------------------------------- Concerning the examples of adjoints given after Exercise 8.3:6, you write > Many of our universal constructions create some sort of "free object" > (free group on a set, free ring on a monoid, "free ring" (tensor ring) > on an abelian group, etc.). ... Well, people use the word "free" with various degrees of generality. The formal definition we have given corresponds to a left adjoint of a set-valued functor, and does not cover things like the monoid ring on a monoid, or the tensor ring on an abelian group. On the other hand, using the term still more widely, one often calls a group etc. 
presented by generating set X and relation-set R "the group freely generated by elements x of X subject to the relations R", though this does not correspond to an adjoint functor, since only one object, not a family of them, is being constructed. Likewise, as noted in the reading, the tensor product construction, though it gives an abelian group "freely generated" by the image of a bilinear map on given groups, is not a left adjoint. So I would say that what you are referring to as "free" objects could be described by the term "left universal constructions", as in the text; and that a large class of these, but not all, are covered by the concept of left adjoint constructions. ---------------------------------------------------------------------- You ask how, two paragraphs before Exercise 8.3:7, I translate the question of whether the product functor C x C --> C has a left adjoint, into the universal-object question on the next page. I am essentially using the characterization of adjoint pairs of functors in Theorem 8.3.8(ii), in which one starts with the right adjoint U, and characterizes the left adjoint F object-by-object in terms of it. So if the product functor is to be a right adjoint, it will have the role of U, and the "C" and "D" of the theorem will be our C and C x C respectively; and the desired condition is that for every object of C (which I call X in the discussion you ask about) there should exist a pair (R_X, u_X) representing the functor C(X, U(-)). Writing R_X as (Y,Z), this means a pair of objects Y and Z with a universal morphism of X into Y x Z. ---------------------------------------------------------------------- Regarding Definition 8.3.9 you asked, > If we have three functors F,G,H, where G is a right adjoint of F > and a left adjoint of H, is there anything interesting we can say > about F and H?
> For instance, in the example you give with products, > coproducts, and the diagonal functor, the product and coproduct > functors are essentially dual to each other. That the product and coproduct are "essentially dual" just means that they are obtained by dual constructions -- left adjoint and right adjoint -- from the diagonal functor, which has no left/right asymmetry. It refers to the way we look at them, not to properties that they will have in a particular category. But the question of whether, when a functor has a left and a right adjoint, those two adjoints have properties relating them, is interesting. I don't know the answer, but I would guess that some properties of G, concerning what distinctions it preserves and what distinctions it loses, would force some dual properties on F and H, which they would share. But I don't know anything concrete about this. Alexey Zoubov has pointed out to me a result along these lines: F is full and faithful if and only if H is full and faithful. This is Lemma 1.3 in "Exponentiable morphisms, partial products and pullback complements" by Roy Dyckhoff and Walter Tholen, Journal of Pure and Applied Algebra 49 (1987), 103-116, https://doi.org/10.1016/0022-4049(87)90124-1 ---------------------------------------------------------------------- Regarding the p-adic numbers (section 8.4) you write > I am somewhat familiar with the Hasse local-global principle, or at > least the result, that an equation solvable over all p-adics and over > the reals has a solution over the rational numbers. Does this result > arise in any way from the sort of constructions we have encountered? > I'm curious if the p-adic rings, themselves inverse limits, have any > similar relations which are useful in proving such results.
Well, there's an obvious approach to looking for necessary and sufficient conditions for something to hold: One puts together all the necessary conditions one can come up with, and hopes that when one has listed enough of them, their conjunction will be sufficient as well, and that one can express that conjunction in some concise form. Necessary conditions for an equation to have a solution in the integers are that it not contradict anything one can deduce either using congruences, or using inequalities. Consistency with what one can deduce using congruences comes down to solvability in each Z/nZ. But Z/nZ is isomorphic to the direct product of the Z/p^i Z for p^i ranging over the maximal prime-powers dividing n; so having solutions modulo all integers is equivalent to having solutions in all rings Z/p^i Z. For fixed p, the conditions of having solutions in Z/p^i Z for various i are not independent, due to the homomorphisms (8.4.3); but the conjunction of these conditions over all i is equivalent to the existence of a solution in the inverse limit of (8.4.3), i.e., the p-adics. So the conditions one can get using congruences can be concisely summarized by saying "there exist solutions in the p-adics for each p". On the other hand, the study of the equation via inequalities comes down to asking whether it has a solution in the reals. The principle you describe evidently says, "Yes, congruences and inequalities are together enough to tell whether an equation has a solution." Our construction of the p-adics can be thought of as a way of condensing the information about integers arising from the study of congruences. As we noted, the p-adics have no zero-divisors, though each Z/p^i Z does; so this way of condensing information brings a kind of elegance. (You state the principle for solutions over the rationals rather than the integers. 
I guess for that case one would look at "congruences of rational numbers modulo fractional ideals", and use the p-adic field in place of the p-adic integers. By clearing denominators, I think questions of solutions in the rationals can be reduced to questions of solutions in the integers, and the use of the p-adic field can likewise be reduced to the use of the p-adic ring.) ---------------------------------------------------------------------- Regarding germs of functions, mentioned in the first paragraph of section 8.5 as a motivating example for direct limits, you ask why "germs" are so called. My guess is that the term arose in complex analysis. There, if one knows an analytic function in a neighborhood of a point, no matter how small, that determines uniquely its extension to as large a connected domain as it can be defined on. The original meaning of "germ" is "sprout" (what one gets when a seed germinates); so I think the idea was that a "germ" of an analytic function was a tiny thing that had enough information to determine the whole thing. When one considers general continuous functions instead of analytic functions, the analogous entities no longer have the property of determining the value away from the point in question, but the word was probably carried over because the concept was useful. (And because the everyday sense of "germ", which refers to a microscopic entity, made it seem natural.) ---------------------------------------------------------------------- I meant to work the answer to your question into my lecture, but didn't get to it. You asked whether there is a notion of convergence associated with the direct limit. I think that the main connection is conceptual -- that if X is the direct limit of a family of objects X_i, then the successive X_i are "more and more like X".
E.g., in the case of a direct limit of groups, they in general have more and more of the elements that will show up in X, and these satisfy more and more of the relations that are satisfied there. But it may be possible to turn this into topological convergence. If we define a language in which there are symbols for all the elements of X, and for the kinds of relations that these can satisfy, then we might define a topology on any set of objects of the indicated sorts, some elements of which are labeled with some of the element-symbols, taking for a subbasis of open sets the sets of objects characterized by having (or not having) an element symbolized by each symbol, and satisfying (or not satisfying) a given relation on such elements; and then I think the direct limit X would be the topological limit of the system of "points" X_i. (In an abstract category, we might do something similar using the existence of maps from various objects in the category satisfying various composition relations; but I haven't thought this through.) ---------------------------------------------------------------------- You ask whether, if a functor F and a subfunctor G of F both have a limit (Definition 8.6.1), the limit of G will be a subobject of the limit of F. This works for Set-valued functors, with "subobject" understood to mean "subset"; hence it works for categories of algebraic objects where limits can be constructed using limits of underlying sets. However, using different choices of what to call subobject, it can fail. E.g., if in Group we choose to define a "subobject" to be a subgroup of finite index, then it fails for infinite direct products. ---------------------------------------------------------------------- Perhaps in connection with Mac Lane's use of "complete", mentioned in the sentence before Exercise 8.6:2, you ask > For any category, can we construct a "completion" category > with all limits? There are such constructions, but I'm not familiar with the details. 
One obvious approach is to start with the Yoneda embedding of C in Set^{C^{op}}, note that, like any category of the form Set^D, the latter has limits, and close the image of C in that category under such limits. Another is to take a category whose objects are formal limits (one for each diagram whose limit one wants to allow), and let it have just those morphisms that the universal properties of limits require. Whether these constructions would give the same result, I don't know. A problem is that a category may already have some limits, but our construction might create new limits that don't agree with these. For example, the category Set clearly has inverse limits; but suppose we embed Set in the category HausTop of Hausdorff topological spaces by giving each set the discrete topology. Now consider the system of sets which, for convenience, I will write as ... -> Z/8Z -> Z/4Z -> Z/2Z -> 0. (We're not interested in the group or ring structure; just in the fact that as one goes back a step, each point bifurcates.) In HausTop, its inverse limit is the Cantor set, a compact space. In Set, its inverse limit is the same space with the discrete topology. If we construct a universal completion of Set, then since HausTop is complete, the inclusion of Set in HausTop will induce a functor from our universal completion to HausTop which maps the inverse limit of the above system to the compact Cantor set, while since the set-theoretic Cantor set was already in Set, this will be mapped to the discrete Cantor set. Hence the limit of the above system in the universal completion will not be the same as its limit in the original category. Of course, one might choose to construct a completion designed to preserve those inverse limits that already existed, and which would be "universal" only for functors that preserve such limits. Looking online, I see that Grothendieck defined a completion "pro-C" for every category C. (For the "pro-", see paragraph before Ex. 8.5:18.) 
See also http://ncatlab.org/nlab/show/completion and http://mathoverflow.net/questions/59291/completion-of-a-category At some point I'll have to learn more about these things, for a paper I've been putting off writing for years. But not during a semester when I'm teaching. (Incidentally, where you wrote "limits", I'm not sure whether you meant inverse and/or direct limits in the sense of section 8.5, or the more general limits and/or colimits in section 8.6; but similar considerations should apply to both, though the details of the category one got would differ.) ---------------------------------------------------------------------- Regarding Proposition 8.6.3, you note > ... You write C^D(\Delta(-), F) : C^{op} --> Set, but why need the > target be sets? Is it assumed that C^D is legitimate? The C^D(\Delta(X), F) will be sets, all right, but I agree that they may not be small sets, unless we assume D small. Thanks; I'll put a note about this on the errata page. ---------------------------------------------------------------------- Regarding Exercise 8.6:6(ii) you ask how an initial object, defined in terms of morphisms to other objects, can be characterized as a limit, whose universal property asserts the existence of morphisms from other objects. If you look at the definitions of limit and colimit, you will see that each of these involves morphisms both into and out of the object in question. (Those morphisms have different roles in the definition; so the exercise requires relating morphisms that have one role in one definition to morphisms having a different role in the other.) ---------------------------------------------------------------------- Meant to answer your question in class, but I fell behind schedule in covering the earlier material. You asked, in the context of Theorem 8.8.9, whether one of the composite colimits in (8.8.10) could exist, and the other not exist. Yup. 
The example I've come up with is a direct limit, indexed by the natural numbers, of coequalizers in the category of finite sets. Let the n-th coequalizer diagram have for domain the integer n = \{0,...,n-1\} and for codomain n+1 = \{0,...,n\}, and let the two maps from the domain to the codomain be i |-> i and i |-> i+1. You should check that for each such diagram, the coequalizer is a 1-element set. On the other hand, let the n-th such diagram be mapped into the n+1-st by inclusion of domains and inclusion of codomains. Then if we take the direct limit over n of either domains or codomains, the result "wants to be" the set \omega, but this doesn't lie in FinSet; and it is not hard to show that no object of FinSet has the universal property of the desired direct limit; i.e., it does not exist; so of course there's no way to construct the coequalizer of these (nonexistent) direct limits. On the other hand, if one first takes coequalizers, getting 1-element sets, one sees that a direct limit of these exists, namely the 1-element set; so in that order, the colimit of colimits exists. ---------------------------------------------------------------------- You ask about the second sentence after Theorem 8.8.9, saying that if we assume that all functors from D to C have colimits, then the isomorphism between the left and right expressions in (8.8.10) becomes a case of Theorem 8.8.3. I see that that does need a bit of explaining! The first assertion of Theorem 8.8.3 says that if F: C --> D is a left adjoint functor, and S: E --> C has a colimit, then lim_E FS = F(lim_E S). (I will use "lim" in this e-mail for the colimit symbol, "lim" with an arrow "-->" under it.) In the application we want to make, the roles of C and D in that theorem are taken by C^D and C respectively, the role of F by lim_D, and the role of S by B(-,-), regarded as a functor E --> C^D. ("E" is the one symbol that translates to itself!) 
Then the formula lim_E FS = F(lim_E S) takes the form lim_E lim_D B(-,-) = lim_D lim_E B(-,-). (When I have time, after the end of the semester, I should add a clarification of this to the online revised version of the text.) ---------------------------------------------------------------------- You ask why I didn't simplify Definition 8.8.1 by putting Definition 8.8.13 before it. The main reason was that I felt the details of Definition 8.8.1 emphasized the concept; and I feel that it is often good to have to deal with a concept "by hand" before introducing the machinery that handles it slickly -- one appreciates the machinery, and has a better sense of what it does. There's also a technical reason. The morphism of Definition 8.8.13 only makes sense if we assume both (co)limits exist. But if we merely assume Lim S exists, then the cone of Definition 8.8.1 exists. In that way, 8.8.1 is simpler than 8.8.13: it merely assumes Lim S exists, and the condition it gives is that F(Lim S) be the desired limit of FS. ---------------------------------------------------------------------- In connection with the observation in the first paragraph of section 8.9 that direct limits commuting with products is what allows us to make a direct limit of algebras an algebra, you point out that a family of operations on a set can be regarded as a map from a coproduct of products of the set into the set, and you ask whether the fact that direct limits respect coproducts is useful in this connection. Interesting question. I don't see it as directly important -- it seems easiest to just regard each of the family of operations as carrying over to a direct limit. But if we consider a construction that does not respect coproducts, such as that of direct product, we get some useful insight. Consider, for utmost simplicity, algebras consisting merely of a set with two unary operations, \alpha and \beta. 
If X and Y are two such structures, then the direct product of their underlying sets, |X| x |Y|, can be made an algebra of the same sort in the obvious way, writing \alpha(x,y) = (\alpha(x),\alpha(y)) and \beta(x,y) = (\beta(x),\beta(y)). But regarding the combination of \alpha and \beta as maps X \coprod X --> X and Y \coprod Y --> Y, we see that together they induce a map (X \coprod X) x (Y \coprod Y) --> X x Y i.e., (X x Y) \coprod (X x Y) \coprod (X x Y) \coprod (X x Y) --> X x Y, i.e., four rather than two unary operations on X x Y, which turn out to be (x,y) |-> (\alpha(x),\alpha(y)), (x,y) |-> (\alpha(x),\beta(y)), (x,y) |-> (\beta(x),\alpha(y)), (x,y) |-> (\beta(x),\beta(y)). This kind of phenomenon is of interest -- e.g., if M and N are R-modules, though we usually make M x N an R-module, sometimes we prefer to make it an (R x R)-module instead. ---------------------------------------------------------------------- > In proposition 8.9.3, it says that "In Set, direct limits commute > with finite limits." This commutation should be understood as 'up to > isomorphism', correct? Well, limits and colimits are only defined up to isomorphism -- they are objects with universal properties, and if one object has the universal property, then anything isomorphic to it does. It is true that if we have specific constructions of the limit and colimit in one order and in the other, then the "commutativity" statement means that the two resulting objects are isomorphic (by an isomorphism that respects the appropriate structure). See Definition 8.8.1, last sentence and then first two sentences. ---------------------------------------------------------------------- Regarding Proposition 8.9.3, you ask > ... what other important categories have this nice property > that directed colimits commute with finite limits? 
Because finitary algebras define their operations using finite direct products (a finite limit construction), we will be able to prove in the next chapter, when we study the general concept of algebra, that direct limits of finitary algebras arise from direct limits of their underlying sets. That will be our main use of this result; but that fact can then be used to prove that the result you ask about holds for any variety of finitary algebras. ---------------------------------------------------------------------- > In proposition 8.9.11, you write of categories being > generated by a set of morphisms. Is this more generally a useful > way of thinking about a category, or is it mostly just the condition > which makes the proposition true? ... I think it can be useful when the category is used to "index" something; e.g., when it is a diagram category over which we will take a limit or a colimit. For instance, of the two categories illustrated by centered displays in the paragraphs following Exercise 7.2:1, the first one is generally pictured without showing the diagonal arrow, because that is the composite of other arrows, and the second one is shown (there and in general) without showing the composites of the short arrows, for the same reason. And the point is not merely that one gets a less cluttered picture, but that in defining a cone to or from such a category, it is enough to make sure the arrows of the cone make commuting triangles with the generating morphisms; it then follows that they make commuting triangles with all morphisms. Also, people sometimes look at how results on monoids can be generalized to results on categories; and how results on rings can be generalized to results on Ab-categories; and since generation properties are of interest in those fields, they should be of interest in the category-theoretic generalizations. However, I have not seen the concept used except by myself -- in this section, and a paper I wrote based on it. 
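To make the remark about generating morphisms concrete, here is a small sketch (my own illustration; the category, sets, maps, and names are all invented, not from the text). We realize the "commutative square" diagram category by sets and maps, and check a cone against only the four generating arrows; commutativity with their composite (the diagonal) then comes for free.

```python
# Objects A,B,C,D of the square category, with generating arrows
# f: A->B, g: A->C, h: B->D, k: C->D satisfying h.f = k.g.
# Maps between finite sets are represented as dicts.

def compose(f2, f1):
    """Composite f2 after f1, for maps represented as dicts."""
    return {x: f2[f1[x]] for x in f1}

f = {0: 0, 1: 2}            # A = {0,1} -> B = {0,1,2}
g = {0: 0, 1: 1}            # A -> C = {0,1}
h = {0: 1, 1: 3, 2: 0}      # B -> D = {0,1,2,3}
k = {0: 1, 1: 0}            # C -> D

# The square commutes on the generators:
assert compose(h, f) == compose(k, g)

# A cone to this diagram from the apex X = {"p","q"}:
lamA = {"p": 0, "q": 1}
lamB = compose(f, lamA)
lamC = compose(g, lamA)
lamD = compose(h, lamB)

# The cone condition, checked only for the generating arrows:
assert compose(f, lamA) == lamB
assert compose(g, lamA) == lamC
assert compose(h, lamB) == lamD
assert compose(k, lamC) == lamD

# ...and it then holds automatically for the composite (diagonal) arrow:
diag = compose(h, f)
assert compose(diag, lamA) == lamD
```

So, as in the picture described above, the diagonal arrow never needs to be drawn or checked.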
---------------------------------------------------------------------- You ask about the statement preceding the display (8.9.5), that by the construction of direct limits in Set, there exist D(i)\in P and an element x_i satisfying that display. Well, we have just shown that the left-hand side of that display lies in the direct limit over D of the sets B(D,E_i). The construction of direct limits in Set describes that direct limit as an image of the disjoint union of the sets B(D,E_i). What I don't say in Lemma 8.5.3, but is implicit, is that the maps of the given sets into that image constitute the coprojections; in this case, the q(D,E_i). So the desired element must be the image of some element of some set B(D,E_i) under its coprojection. For each i, we can name the "D" indexing that set as D(i), and we can name the element in question in that set x_i. That gives the display you asked about. ---------------------------------------------------------------------- Regarding the lines preceding Corollary 8.9.9, you ask what is meant by a good finite subset. The word "good" is in quotation marks, so I am not using it in any standard meaning. Rather, by a "good" finite set I mean a set that satisfies some properties that will be helpful in getting the desired conclusion. What those properties are is seen in the statement of the Corollary. ---------------------------------------------------------------------- You ask about the background of the term "solution-set condition" (introduced two paragraphs before Lemma 8.10.3). Well, if you want to describe, say, the group presented by a set X of generators and a set R of relations, then the functor that you want to represent is the one associating to every group the set of X-tuples of its elements which are solutions to the system of equations R. 
Moreover, the condition that is needed to prove that such a representing object exists is, in classical language, that a *set* of groups with such X-tuples exists that is as good as the *class* of all such groups and tuples. (In our language, for "set" and "class" read "small set" and "large set".) The above is my guess as to the background of the term. Unfortunately, mathematicians seldom say what leads them to choose their terminology. It could, rather, be more like what you suggested: a (small) set that is a "solution" to the requirements of the proof. ---------------------------------------------------------------------- You ask (if I understand correctly) whether one can get the uniqueness of adjoints using the construction of Theorem 8.10.5. I don't think so. One of the marvelous facts about objects with universal properties is that one can obtain them in different ways, yet the different objects so obtained must be naturally isomorphic. But the ways of constructing them don't give those isomorphisms; that comes from the universal property. But your question adds the words "or other characteristics of possible adjoints"; and to this, one can sometimes give a positive answer. For instance, though no one in this class has yet seen what natural property the groups of (3.3.3) and Exercises 3.3:1-2 are universal for, once one does, the method of construction as a subgroup of a direct product does give one a bound on the order of the groups with that property. And studying which elements of the direct product lie in the subgroup, one can further improve that bound. ---------------------------------------------------------------------- Pointing to the contrast between Exercises 8.10:6 and 8.10:7, you ask about general results on universal constructions in classes of algebras with large sets of operations. I don't know whether such results have been looked at. In General Algebra, it is most natural to consider algebras with small sets of operations. 
Complete lattices are most naturally thought of as having meets and joins of arbitrary subsets; but operations on subsets don't fit the techniques of General Algebra. The best way one can accommodate them is to treat them as "large" families of operations on tuples of elements. As such, they are more like the kinds of structures studied in General Algebra than sets with operations on subsets would be, but they still push the envelope. I think "random" sorts of algebras with large classes of operations will "almost never" have free objects; it is only certain sorts that arise in special ways -- such as complete semilattices -- that have these. ---------------------------------------------------------------------- You ask why objects characterized by right universal properties are usually easier to construct directly than those characterized by left universal properties, as mentioned near the end of section 8.10. I think it is because the objects we look at in algebra are defined using maps on direct products of underlying sets, and direct products respect right universal constructions. As a result, right universal constructions on algebras turn out to be based on the corresponding constructions on their underlying sets. ---------------------------------------------------------------------- You ask about the discussion in the last two paragraphs of section 8.12 where I state that the concept of "Cat-based category" is invariant under reversing order of composition. First, let me be explicit about what this means: It means that if C is a Cat-based category, then C^{op}, defined by letting C^{op}(P,Q) = C(Q,P), is also a Cat-based category if one takes the same concept of morphisms-among-morphisms that one had before. (It also becomes a Cat-based category if one reverses the directions of morphisms-among-morphisms; but let's just look at one modification at a time.) 
The point is that if one has a concept of composition, given by morphisms C(Q,R) x C(P,Q) -> C(P,R), then one can regard these as morphisms C^{op}(R,Q) x C^{op}(Q,P) -> C^{op}(R,P), and the left-hand side can be rewritten C^{op}(Q,P) x C^{op}(R,Q), giving precisely the kind of map one needs to define a category structure C^{op}. As for the composition of morphisms among morphisms, this is based on the "op" functor, which goes from Cat --> Cat, not from Cat to Cat^{op}. Anyway, I suggest that to get the idea without the complications of Cat-based categories, you look at the very last (3-line) paragraph on the page, which talks about the "op" construction on ordinary (Set-based) categories, and note that the opposite of an ordinary category is again an ordinary category, not a Set^{op}-based category. ---------------------------------------------------------------------- Regarding the last paragraph of section 8.12, you ask "why do we need the product of Set to define "op" functor?" We use the product construction in Set in defining the concept of category, since "composition" is defined by maps C(Y,Z) x C(X,Y) -> C(X,Z) (where "x" here stands for "direct product of sets"). So when we define the opposite of a category, we have to see how to take such maps defined on product-sets of C 's hom-sets and turn them into maps on product-sets of C^{op} 's hom-sets. To fit the changed domains and codomains of our maps, this turns out to require reversing the order of "C(Y,Z)" and "C(X,Y)", and that uses the symmetry of the direct product of sets. ---------------------------------------------------------------------- You ask, regarding Proposition 9.1.6, "What goes wrong in the construction of colimits of algebras?" To answer that, I need some idea of how you think they would be constructed. So please let me know that, and I'll reply! 
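Returning to the "op" construction discussed above, here is a miniature version of the order-reversal (my own illustration, not from the text). A monoid is a one-object category, so its opposite is the same set of morphisms with the two factors of the composition map swapped -- exactly the use of the symmetry of the direct product described above. Using 2x2 integer matrices (an arbitrary choice) as the monoid:

```python
def mat_mult(a, b):
    """Multiply 2x2 matrices represented as ((a,b),(c,d))."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op_mult(a, b):
    # Composition in the opposite category: the same data,
    # with the arguments swapped.
    return mat_mult(b, a)

I = ((1, 0), (0, 1))
X = ((1, 1), (0, 1))
Y = ((0, 1), (1, 0))
Z = ((2, 0), (1, 3))

# The reversed composition is still associative and unital, so the
# "opposite" structure is again a monoid, i.e., again a category:
assert op_mult(op_mult(X, Y), Z) == op_mult(X, op_mult(Y, Z))
assert op_mult(I, X) == X == op_mult(X, I)

# But it genuinely differs from the original composition:
assert op_mult(X, Y) != mat_mult(X, Y)
```

Note that no structure beyond the original composition map was needed to define op_mult; only the order of the inputs changed.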
---------------------------------------------------------------------- Regarding Lemma 9.2.2, you ask whether, when \gamma is singular, the sequence of subsets S^{(\alpha)} always continues to grow past S^{(\gamma)}. Certainly not! For the easiest case, one can start with X = A; and then all S^{(\alpha)} are equal to A, i.e., the growth stops at the first step. By other constructions, one can obtain examples where the growth stops at any specified ordinal less than or equal to the value given by Lemma 9.2.2. ---------------------------------------------------------------------- Regarding Exercise 9.2:5, you ask, > ... what does compact as an element of the lattice of subalgebras > of A mean? Did you look up "compact" in the index? If, after doing so, you still have a question about it, let me know and I'll try to help. ---------------------------------------------------------------------- Regarding the concepts of identity and variety (Definitions 9.4.1 and 9.4.6), you ask > ... Could we interpret \Omega-algebras satisfying sets of identities > as structures satisfying certain first-order theories in languages > possessing only function symbols in their signatures? ... It's not only the signature that has to be restricted, but also the syntax: No existential quantifiers, no negation, no implication, no disjunction. Just universally quantified equations. (One could allow conjunction, since having a conjunction in one's theory is just equivalent to having each of the conjoined identities; but esthetically it seems nicest to leave conjunction out here.) Certain slightly more general languages give classes of algebras that can also be treated nicely. 
For instance, if one allows, along with sentences of the preceding sort, universally quantified sentences of the form (conjunction of equations) => (equation), then one finds that one still gets free algebras and algebras presented by generators and relations, and that a class of algebras defined by such sentences is closed under almost all (but not quite all) of the operations discussed in today's reading. Such a class is called a "quasivariety of algebras", and these are also studied in universal algebra. An example of a quasivariety is the class of torsion-free groups; another is the class of rings without nonzero nilpotent elements (elements satisfying x^n = 0 for some n). ---------------------------------------------------------------------- Regarding section 9.4 you ask > What sort of questions in specific theories like group theory or ring > theory do the results of this section help us to answer or rephrase > in a more manageable manner? None occur to me. I would say the question is like asking, "What sorts of problems about the complex numbers does the concept of a field help us answer?" The value of that concept is to abstract, from specific structures like the complex numbers, general properties that can be studied in a much larger class of cases. Results can then be -- and are -- proved about fields in general, and applied to the complex numbers in particular; but those results could, in principle, have been proved for the complex numbers alone, if we were sure that that field was all we would ever care about. Our goal from the beginning of the course was to find the common context in which results regarding free objects, construction by generators and relations, subclasses determined by identities, etc., could be proved for varied classes of algebras such as groups, rings, and so on. And we have just done that. 
The results of this reading may look dull, because they are things that we already knew for the specific classes of algebras that we are familiar with. In subsequent sections, we shall get past these "dull" basics, and the material developed will hopefully look more interesting. ---------------------------------------------------------------------- Regarding section 9.4 you note, > In chapter 1, one exercise gives an alternative formulation of the > concept of group: we can get away with just the one 2-ary operation > > delta(x, y) = xy^-1. > > This gives us two varieties that are basically the same: ... Actually, the description of groups in terms of the operation delta does not quite give a variety; if one just uses identities, then the variety one gets consists of structures corresponding to groups, and also an empty structure. The category of groups is equivalent to the subcategory of nonempty algebras in this variety. However, that quibble (which one can get around by throwing into the above variety a zeroary operation e and an identity delta(x,e)=x, or delta(x,x)=e) doesn't invalidate the point you make: > ... We could say the varieties are isomorphic as categories, but > that doesn't say it all, so is there a term for this "stronger than > isomorphism but not quite equality" of varieties? I'm not sure whether there is a standard term. In the language of sections 9.9-9.10, which we will read in about a week, the relation is that the two varieties of algebras have isomorphic "clones of operations". We will see that two varieties are related in this way if and only if there is an equivalence between them that respects underlying-set functors. 
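As a concrete check of the interdefinability behind this relation between the two varieties (my own illustration, not from the text): from the single operation delta(x,y) = x y^{-1}, one recovers the identity element, inverses, and multiplication. Here I use the group Z/5 under addition, an arbitrary choice, so delta(x,y) = x - y mod 5.

```python
G = range(5)
delta = lambda x, y: (x - y) % 5   # delta(x,y) = x y^{-1}, written additively

e = delta(0, 0)                       # e = delta(x,x), for any x
inv = lambda y: delta(e, y)           # y^{-1} = delta(e, y)
mult = lambda x, y: delta(x, inv(y))  # x y = delta(x, delta(e, y))

assert e == 0
assert all(mult(x, inv(x)) == e for x in G)
assert all(mult(x, y) == (x + y) % 5 for x in G for y in G)
```

So on any nonempty algebra of this sort, the group operations are derived operations of delta, which is the content of the equivalence.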
Birkhoff named two algebras (as distinct from {\em varieties} of algebras) of possibly different types "crypto-isomorphic" if (in the above language) their clones of operations were isomorphic, which is equivalent to saying that the varieties they generate are equivalent in this way, via an equivalence that takes one algebra to the other. ---------------------------------------------------------------------- Regarding section 9.4 you ask, > ... Is there a general way in \Omega-Alg to express the things > where a property exists for some element. For example the groups for > which there is an element of order 5, but not necessarily satisfying > the identity x^5=e for all x? Well, elements g\in G satisfying g^5 = e correspond to morphisms from the group <g | g^5 = e> (i.e., Z/5Z) to G. Such an element will have order 5 if the morphism does not factor through the natural map to the trivial group. So one can describe groups having elements of order 5 by the condition that they will have such a non-factoring morphism from <g | g^5 = e>. But it isn't a very natural class of groups -- it isn't closed under homomorphic images or subalgebras; and in general, one can't do universal constructions in it. Perhaps the fact that one can describe it is what you are looking for; but giving such classes a name isn't likely to be helpful. There are two directions one can go from there to get more natural classes. One is to look at the category consisting of groups with a distinguished element g satisfying g^5 = e (dropping the requirement that this element not equal e). This is essentially a variety of algebras: We take the description of the variety Group, and adjoin one additional zeroary operation, specifying the distinguished element, and one additional identity, saying that the new element should have exponent 5. Homomorphisms between such structures should, of course, take distinguished element to distinguished element. The other is to consider groups _not_ having any elements of exponent 5. 
That is not, I think, equivalent to a variety; it is defined by operations, and identities, and the additional condition (\forall x) (x^5 = e) => (x = e). Classes of algebras defined by such conditions -- identities together with universal equational implications -- are called "quasivarieties", and they have properties nearly as nice as varieties. This quasivariety is really the opposite of what you asked for (groups with an element of order 5); but perhaps you'll find it as interesting as what you asked for. I guess there's a third answer to your question. Model theorists consider any sort of family of first-order sentences, and look at the class of "models" of that family, and call them axiomatic model classes. So if we let the family be the identities for groups, together with the sentence (\exists x) ((x^5 = e) and not-(x = e)), then the resulting axiomatic model class is what you are asking about. But in considering such classes, model theorists are pretty far from universal algebraists. ---------------------------------------------------------------------- You ask whether one has an analog of Cayley's Theorem for objects of an arbitrary variety of algebras (Definition 9.4.6). The examples where we had "Cayley's Theorem" type results were all for classes of structures that were defined as models of certain sorts of mathematical phenomena; and in those cases, it turned out that the sorts of structures so defined modeled the phenomena in question sufficiently nicely that any structure fitting our definition could be realized in the way that motivated the concept. Though the majority of the varieties that algebraists study are models of general sorts of phenomena, an arbitrary variety is not required to be motivated in such a way; so there is no natural formulation of a "Cayley's Theorem". Even when a variety is so motivated, the characterization may not be good enough to give a "Cayley's Theorem". 
The result of this sort for Lie algebras is the Poincare-Birkhoff-Witt theorem, which shows that every Lie algebra over a field can be represented by commutator brackets in an associative algebra over the same field; but as noted in the last sentence of Exercise 9.7:2, this is not true of Lie algebras over general commutative rings. ---------------------------------------------------------------------- You ask about the terminology used in display (9.5.4), in particular, the terms "model" and "first-order theory". Model Theory studies structures consisting of a set given with a family of relations of various arities on it, some of which may be operations, and statements about it which can be expressed by formal sentences constructed using element-symbols, relation symbols corresponding to the given relations, and logical operations such as "implies", "for all", "there exists", etc. A "first-order sentence" allows these operations, but with "for all" and "there exists" applicable only to elements, not to relations. (So a statement like "X is infinite", though it can be expressed by saying "there exists a set of pairs of elements which satisfies the conditions to be a function X -> X that is one-to-one but not onto", is not equivalent to a first-order sentence.) Given a language involving certain relation-symbols, a "model" for that language is a set given with relations corresponding to the relation-symbols of the language; and the "theory" of a model or family of models is the set of those statements in the language which are true for all these models. A model of a theory is any model which satisfies all the sentences in the theory. Our concept of a variety is a restricted case of these concepts: The only relations we consider are operations, and the only sentences we consider are universally quantified equations (so, nothing with "not", "or", "there exists", etc.) 
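To illustrate how restricted that syntax is, note that a universally quantified equation can be verified in a finite model simply by checking every tuple of elements. Here is a small sketch of my own (the algebra, Z/4 under multiplication, is an arbitrary choice, as are the names):

```python
from itertools import product

def holds(eq, elements, arity):
    """True if the equation eq(...) is satisfied by every tuple of the given arity."""
    return all(eq(*xs) for xs in product(elements, repeat=arity))

Z4 = range(4)
mult = lambda x, y: (x * y) % 4

# Associativity x(yz) = (xy)z is an identity of this algebra:
assert holds(lambda x, y, z: mult(x, mult(y, z)) == mult(mult(x, y), z), Z4, 3)

# Idempotence xx = x is not an identity here (e.g. 2*2 = 0 != 2):
assert not holds(lambda x: mult(x, x) == x, Z4, 1)
```

Sentences with existential quantifiers, negations, or implications would require more than this single all-tuples loop, which is one way of seeing that identities form a very special class of first-order sentences.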
---------------------------------------------------------------------- Regarding the discussion of Vopenka's principle following (9.5.4), you ask what the "special properties" of the cardinal mentioned in the discussion there are. I'm afraid you'd have to ask a logician that! Sorry. ---------------------------------------------------------------------- > Does the term "derived operation" (definition 9.5.1) have anything > to do with derived functors and derived categories? ... I don't think so. Vague everyday words like "normal", "derived", "regular", etc. get borrowed over and over again into mathematics, with generally unrelated meanings. ---------------------------------------------------------------------- In connection with Proposition 9.6.3(iv), you ask why the endomorphism extending the set map v is not assumed to be unique. The condition "F is generated by the image of X" forces uniqueness. (In this context, it is strictly stronger than uniqueness: E.g., in the situation of Exercise 9.6:4(i), uniqueness holds, but the monoids in question are not free objects in a _subvariety_ of Group. So since we have to state the stronger condition of being generated by v(X), there's no point in bringing in the implied condition of uniqueness of extending homomorphisms.) ---------------------------------------------------------------------- In connection with Exercise 9.6:9 and the following discussion, you ask > Is it easy to find examples of rings that satisfy S_{2d+1} = 0 but not S_{2d} = 0 ... ? ... Hmm -- fiddling around, I think that S_{2d+1} = 0 is equivalent to S_{2d} = 0.
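That claim can be checked mechanically for small d before giving the argument. Here is a sketch in Python (the encoding of noncommuting monomials as tuples, and the helper names, are my own), verifying that substituting 1 for the last indeterminate of S_3 and of S_5 yields S_2 and S_4 respectively:

```python
from itertools import permutations
from collections import defaultdict

def sign(perm):
    # parity of a permutation, via counting inversions
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def standard_poly(args):
    # S_n as a dict mapping noncommutative monomials (tuples of symbols,
    # with the identity element '1' dropped) to integer coefficients.
    poly = defaultdict(int)
    for p in permutations(range(len(args))):
        mono = tuple(args[i] for i in p if args[i] != '1')
        poly[mono] += sign(p)
    return {m: c for m, c in poly.items() if c != 0}

S2 = standard_poly(['x', 'y'])            # xy - yx
S3_at_1 = standard_poly(['x', 'y', '1'])  # substitute 1 for the last variable
assert S2 == S3_at_1

S4 = standard_poly(['x', 'y', 'z', 'w'])
S5_at_1 = standard_poly(['x', 'y', 'z', 'w', '1'])
assert S4 == S5_at_1
```

(The analogous substitution into S_{2d} gives 0, since an even number of alternating signs cancel; so no information about S_{2d-1} comes out that way.)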
Namely, if in S_{2d+1} we substitute 1 for, say, the last indeterminate, and evaluate S_{2d+1}(x_1,x_2,...,x_{2d},1), we find, I think, that those terms where the "1" appears in the last position give S_{2d}(x_1,x_2,...,x_{2d}), those where it appears in the next to last position give the negative of this, those where it appears in the third from last position give S_{2d}(x_1,x_2,...,x_{2d}) again, and so on; and since there are an odd number of positions, we get exactly S_{2d}(x_1,x_2,...,x_{2d}). So the identity S_{2d+1} = 0 implies S_{2d} = 0. > ... Is there a theory of associative algebras with an extra > S_n-like operation, akin to Lie algebras? Not that I've heard of. > ... Is "S_4 = 0" the simplest nontrivial identity satisfied by the > 2 by 2 matrices? I believe it is the polynomial identity of lowest degree; but an identity that one may find conceptually simpler is (XY-YX)^2 Z = Z (XY-YX)^2; i.e., the square of every commutator is in the center. This can be proved for matrices over the complex numbers by noting that XY-YX has trace 0, and verifying that every matrix with trace 0 has square a scalar matrix. (That fact in turn can be "seen" from the fact that among matrices with a given trace, in particular among those with trace 0, the diagonalizable ones form a dense subset, and it is clear that a diagonalizable 2x2 matrix with trace 0 has scalar square.) Knowing the result for matrices over the complex numbers, and the fact that the field of complex numbers generates the variety of commutative rings, one can deduce that the identity holds for matrices over any commutative ring. ---------------------------------------------------------------------- You asked about the fact, mentioned following Exercise 9.6:10, that the concept of "heap", developed in that exercise, had been rediscovered many times, and what could have led to that repeated rediscovery.
I think that two contrasting aspects of the concept of heap are relevant: That it is fairly natural and moderately useful, but that it does not invite study for its own sake, since the isomorphism classes of heaps (other than the empty heap) are completely determined by the isomorphism classes of the corresponding groups. Naturalness and usefulness lead people to discover the concept, but since the theory completely reduces to that of groups, not many works get written about heaps, and no one ends up specializing in them; so few people hear about the concept, and those who need it often end up reinventing it. When I rediscovered them I called them "isogroups"; see p.60 of https://link.springer.com/article/10.1007%2FBF02188011 , line after display (5). After that appeared, someone told me that they had already been defined. The same situation -- being natural and useful, but having a theory that essentially reduces to other theories -- is also true of preorders. I don't know whether they too have been rediscovered several times. If not, perhaps the situations in which they come up are sufficiently widespread that once a name was assigned to the concept, enough people heard of it to prevent the "rediscovery" syndrome. ---------------------------------------------------------------------- You ask about the type \Omega of the variety of Lie algebras (section 9.7). The list of operations begins with the structure of k-module, which has one 0-ary operation (the element 0), one binary operation (addition), and a unary scalar-multiplication operation for each element of k. (If one builds up the concept of a k-module starting from that of an abelian group, then one also has the unary operation of additive inverse, and I list that in (9.7.2). However, that is equivalent to the operation of multiplying by -1\in k, so it can be omitted, as I have done above.) Finally, in addition to all of these, there is the binary operation of Lie bracket.
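The list of operations just described can be summarized as an arity function, one symbol per operation. A small sketch in Python (the encoding, and the choice of k = Z/3 for concreteness, are mine, purely for illustration):

```python
# The type Omega of Lie algebras over a base ring k, recorded as a map
# from operation symbols to arities.  For concreteness take k = Z/3.
k = [0, 1, 2]

Omega = {
    'zero': 0,       # the zeroary operation picking out the element 0
    'plus': 2,       # addition
    'bracket': 2,    # the Lie bracket
}
for c in k:
    Omega[('scale', c)] = 1   # one unary scalar multiplication per element of k

# Three fixed operations, plus one unary operation for each element of k:
assert len(Omega) == 3 + len(k)
assert all(arity in (0, 1, 2) for arity in Omega.values())
```

Note that for an infinite base ring k the type is infinite, since there is a separate unary operation for every scalar.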
---------------------------------------------------------------------- Regarding the concept of a Lie algebra over a commutative ring k (section 9.7), you ask > ... What falls apart if k is allowed to be noncommutative? There is no appropriate way to define a bilinear map on a module over a noncommutative ring. In other contexts, the "correct" noncommutative generalization of a bilinear map of modules is to consider a right R-module M and a left R-module N, and consider a map b: M x N --> A, where A is an abelian group, with b(xr, y) = b(x, ry) for all r\in R. (More generally, M can be an (S,R)-bimodule, N an (R,T)-bimodule, and A an (S,T)-bimodule, for rings S and T which may or may not equal R, and one can add the conditions b(sx, y) = s b(x, y), b(x, yt) = b(x, y) t for s\in S, t\in T.) However, condition (9.7.5), i.e., [x,y] = -[y,x], means that we can't distinguish "right factors" from "left factors" in the operation of a Lie algebra; and there is no decent version of bilinearity of a map of modules over a noncommutative ring that makes sense without such a distinction. But there is a concept that does combine a Lie algebra with a noncommutative algebra; I will mention it briefly in class: A "Poisson algebra" (over a commutative ring k) is a k-module together with two multiplications, one associative and one Lie, such that the Lie bracket with every element is a derivation on the associative algebra structure. ---------------------------------------------------------------------- You write > In the middle of section 9.7, it is said there is a connection > between Lie algebras and Lie groups. Does this connection allow us > to identify an algebra with a (possibly) group or vice-versa? As the sentence preceding Exercise 9.7:6 suggests, it is a many-one relationship: The many different Lie groups that look alike in a neighborhood of the identity have the same Lie algebra.
For each finite-dimensional Lie algebra L over the real numbers, there is a unique *simply connected* Lie group G having L as its Lie algebra. Other Lie groups having L as their Lie algebra can be obtained by dividing G by a discrete subgroup, and/or throwing in more connected components. For a trivial case: if L is the 1-dimensional Lie algebra, which by (9.7.3) has zero bracket operation, then G is the real line R, while other Lie groups with the same Lie algebra include the circle group R/Z, and various groups with R or R/Z as the connected component of the identity element. ---------------------------------------------------------------------- In connection with the discussion following Exercise 9.7:5 of how every Lie group gives a Lie algebra, you ask whether the reverse is true. Every finite-dimensional Lie algebra over the real numbers arises from a Lie group (necessarily of the same dimension). That group is not unique, but for any such Lie group, its universal covering space is again a Lie group, and is the unique simply connected Lie group, up to isomorphism, which gives the indicated Lie algebra. For the infinite dimensional case, one would need to specify a concept of infinite dimensional manifold to use in defining infinite dimensional Lie groups. I know that people do work with such concepts, but there are varied choices of definitions, and I don't know for what choices, if any, such a result has been proved. Infinite dimensional Lie algebras are perfectly natural algebraically, however, as one can see from the part of this section up to where the relation with Lie groups is introduced. ---------------------------------------------------------------------- You ask whether people study nonassociative algebras other than Lie algebras (section 9.7). Yes, but probably more work is done with Lie algebras than with all other sorts of nonassociative algebras combined. The next largest area is that of Jordan algebras, mentioned near the end of this section. 
Another is "power-associative" algebras; i.e., algebras whose 1-generator subalgebras are associative; these satisfy identities such as x(xx) = (xx)x. I have a few papers in which results are proved for nonassociative algebras simply because the arguments needed to prove certain things for Lie algebras (which were what my coauthor in the first two of those papers cared about) didn't really require the Lie identities. The results in the last paper in the group have the curious property that they hold for varieties of k-algebras whose identities have certain properties, satisfied in particular by the varieties of associative, Lie, and Jordan algebras (and many more), but definitely not by all varieties of k-algebras. I can give you copies if you're interested. ---------------------------------------------------------------------- Regarding the concepts of a clone of operations (Definition 9.9.1) and a clonal category (Definition 9.9.5), you ask whether these concepts have applications to other fields of mathematics. The idea of a "clone of operations" is a generalization of specific concepts like "the set of all derived group-theoretic operations", "the set of all derived lattice-theoretic operations", "the set of all continuous operations on a topological space", etc.. These specific concepts are looked at in group theory, lattice theory, topology, etc.. General Algebra (or Universal Algebra) is the field of mathematics where one abstracts from these specific situations and looks at what one can say about such things in general; so from that point of view, the concept of clone "belongs to" General Algebra. People in other fields may or may not find it valuable to put the results they obtain about the objects that they look at in this more general context; insofar as they do, they will find the language of general algebra useful. Some theoretical computer scientists have been happy to adopt concepts from Category Theory and General Algebra, including that of a clone of operations.
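As a small computational illustration of the clone concept itself (independent of the computer-science uses just mentioned): on the two-element set, the binary part of the clone generated by the single operation NAND already contains all 16 binary operations. A brute-force sketch in Python (representing a derived binary operation by its truth table; the closure loop only composes binary operations with binary ones, which happens to suffice here):

```python
def nand(a, b):
    return 1 - (a & b)

# Represent a derived binary operation on {0,1} by its tuple of values
# on the inputs (0,0), (0,1), (1,0), (1,1).
p1 = (0, 0, 1, 1)   # the projection (x,y) |-> x
p2 = (0, 1, 0, 1)   # the projection (x,y) |-> y

clone = {p1, p2}
changed = True
while changed:
    changed = False
    for f in list(clone):
        for g in list(clone):
            # the derived operation (x,y) |-> nand(f(x,y), g(x,y))
            h = tuple(nand(f[i], g[i]) for i in range(4))
            if h not in clone:
                clone.add(h)
                changed = True

assert len(clone) == 16   # NAND generates every binary Boolean operation
```

The same closure computation starting from, say, AND alone gives a much smaller set, reflecting the fact that different generating operations give different clones.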
Whether the use they make of these is in fact valuable, I don't know. ---------------------------------------------------------------------- You ask about many-sorted algebras, mentioned in the third paragraph after (9.9.4). These are very much like 1-sorted algebras. Instead of an underlying set, one has an S-tuple of underlying sets, where S is the set of "sorts"; and the arity of each operation is a list (with repetitions allowed) of the "sorts" of the arguments, and a specification of the "sort" of the output. When one defines a free algebra, instead of having a single free generating set, one has an S-tuple of generating sets, so that the algebra is free on a family of generators of specified sorts. One of the most natural examples is given by graded rings. Such an object (graded, let us say, by the natural numbers, for simplicity) is usually described in ring theory as a ring R that is given with a direct sum decomposition R = \sum_i R_i, the summands R_i being called the "homogeneous component of degree i", subject to the condition that any product of an element of R_i and an element of R_j is an element of R_{i+j}. This definition is adequate, but it is really most natural to regard the graded ring as a system of abelian groups R_i with multiplication maps R_i x R_j --> R_{i+j} satisfying appropriate identities. In ordinary one-sorted algebras, people can (although I don't like to) exclude the empty algebra, and make ad hoc definitions to get around the difficulties this produces, without losing a nontrivial amount of information about the theory of such algebras. But in a many-sorted algebra, any subset of the set of sorts may be empty, so that requiring that every sort should be nonempty loses a nontrivial amount of information. (Though this is not illustrated by graded rings, since the additive structure of each R_i leads to a zeroary operation with output "0_i" in each R_i). 
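For the graded-ring example, the many-sorted point of view can be sketched very simply. In Z[x], graded by degree, the homogeneous component of degree i consists of the elements c x^i, so each sort can be encoded by a single coefficient, and multiplication is a family of maps from the pair of sorts (i, j) to the sort i+j. A toy encoding of my own:

```python
# Sort i of the graded ring Z[x]: the elements c*x^i, encoded as (i, c).
def mul(a, b):
    (i, p), (j, q) = a, b
    return (i + j, p * q)    # the multiplication map R_i x R_j --> R_{i+j}

# (3x^2) * (4x^5) = 12x^7: the output lands in the sort of degree 2 + 5.
assert mul((2, 3), (5, 4)) == (7, 12)

# Each sort also carries the zeroary operation 0_i mentioned above:
def zero(i):
    return (i, 0)

assert mul(zero(1), (4, 9)) == (5, 0)
```

(Addition, by contrast, is a family of maps R_i x R_i --> R_i, one for each single sort i; one cannot add elements of different sorts in this picture.)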
There is an article about some of the resulting complications in the theory, "The point of the empty set" by Michael Barr, Cahiers Topologie Géom. Différentielle 13 (1972), 357-368, MR 48 #2216, though I haven't read it. (From the MR review, it requires the theory of "tripleability", which we haven't covered.) ---------------------------------------------------------------------- You ask about the two definitions of hyperidentity -- the one given before Exercise 9.9:14, and the one mentioned parenthetically before Exercise 9.9:15. These are very different concepts; it is unfortunate that they are given the same name. I find the definition in which the identities are assumed only for the primitive operations an unnatural one: the choice of which operations to regard as the "primitive" ones is just a matter of convenience when defining a mathematical concept. (E.g., one _could_ define "group" using only the operations (x,y) |-> x y^{-1} and e, and then that operation would become primitive while multiplication would no longer be.) So to require that "all primitive operations" satisfy some identities seems very arbitrary. > ... as far as I can tell, the distinction does not matter in > Exercise 9.9:14. It does. If we replace (a) by the statement that all primitive unary operations are equal, this would not imply the same for all derived unary operations. For instance in Group, there is only one primitive unary operation (the operation of inverse), so all primitive unary operations are equal; but the derived unary operation x . x is not equal to the one primitive unary operation. What you may have noticed is that the significance of (b) is unchanged if we replace "primitive" by "derived". That is true precisely because (b) is equivalent to condition (a), which is a hyperidentity in the sense I use.
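The Group example in the answer above is easy to check concretely: in Z/5 (written additively) the one primitive unary operation is negation, while the derived unary operation x . x becomes doubling, and the two are different functions. A sketch:

```python
n = 5
inverse  = [(-x) % n for x in range(n)]      # the primitive unary operation
doubling = [(x + x) % n for x in range(n)]   # the derived operation "x . x"

# All primitive unary operations are (trivially) equal -- there is only
# one -- yet a derived unary operation need not equal it:
assert inverse != doubling
```

So "all primitive unary operations are equal" holds vacuously in any group, while the corresponding statement about derived unary operations fails already in Z/5.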
---------------------------------------------------------------------- Regarding the last paragraph of section 9.9, you ask > Does this construction correspond in any way to an enriched structure > on the category C? I think that a functor \box of the indicated sort is the kind of thing one uses in defining a "category over C", i.e., a category D whose morphism-sets are objects of C, and whose composition operations are morphisms D(Y,Z) \box D(X,Y) --> D(X,Z). So it's not C that is an enriched category, but the resulting categories D. However, the assumptions one has to make on \box for this discussion may be weaker than those needed to define enriched categories. I don't know much about operads -- I've just seen the concept sketched, and in these paragraphs, I'm pointing the interested reader to something that he or she might want to look into. ---------------------------------------------------------------------- Concerning the second line after (10.1.5), you write > ... you say that $x, y \in |SL(n, A)|$ are the images of associated > universal element $r\in |SL(n, R)|$ under homomorphisms > $f, g : R --> A$. You mean V(f)(r) = x and V(g)(r) = y? Strictly speaking, yes. However, here I am writing informally, and given a homomorphism f of rings, and a matrix M over the domain of f, one can speak of "applying f to M," meaning that one applies it entrywise. Hmm, maybe if I change "under" to "via" it would at least make clear that the reader has to think about how x and y are obtained from f and g. ---------------------------------------------------------------------- Regarding the concept of cogroup, sketched at the end of section 10.1, you ask > Do such things as cogroups arise naturally in other contexts than > looking at the existence of an adjoint? This just depends on what one considers a "good motivation". I find the question "which functors have adjoints?" to be of great interest, so I motivate coalgebras in these terms. 
Alternatively, one can say "Many of the most basic constructions of algebra, when regarded as set-valued functors, turn out to be representable. Yet they also often have algebra structures (e.g., the construction SL(n)). How does such algebra structure arise?" As sketched in section 10.1, it arises from a coalgebra structure on the representing objects. ---------------------------------------------------------------------- Regarding the development of V-algebra objects of a general category, i.e., objects analogous to systems of algebras defined by operations and identities, in section 10.2 (and also the fact that we developed the theory of varieties, and not of more general classes of algebras, in earlier sections), you ask > ... Why do we stop at identities -- why not proceed to arbitrary > predicate calculus statements, then to arbitrary \Pi_1 statements ... Two reasons. The first was illustrated by the exercise in Chapter 2 showing that there do not exist free fields, and the exercise in Chapter 4 showing that a commutative ring R does not, in general, have a universal homomorphism to an integral domain. We are looking for constructions with universal properties, and classes of models of arbitrary sentences in the predicate calculus (e.g., the sentences defining fields and integral domains) do not typically admit these constructions. The class of models of a family of universally quantified equations, i.e., identities, does. Secondly, not every sort of sentence in the predicate calculus has a reasonable category-theoretic translation. There are ways to remedy each of these problems. One can investigate what sorts of sentences, other than identities, do yield classes of models allowing universal constructions, and develop the appropriate topics -- yielding "quasivarieties", "prevarieties", and related concepts.
And one can look for special classes of categories for which one can define the analogs of models of general sentences, leading into the theory of "topoi". The first set of concepts would allow us to generalize the material of this chapter; and if I ever find time to write further chapters, I intend to introduce them; but they wouldn't fit into the scope of a 1-semester course (unless we left out a lot of other things we've done). On the other hand, as to using a topos as the category in which we define our algebra objects, this would be very restrictive. E.g., the opposite of the category of commutative rings is not a topos, so our description of SL(n,-) in terms of a group object in that category could not be expressed in this context. If we wanted a context in which we could do these things, it would have to be one which would exclude most of what I center this course around. This could be a supplementary topic for a continuation of this course, but not one that subsumes what we have been doing. Incidentally, I use the concepts of quasivariety and prevariety in a recent preprint, http://math.berkeley.edu/~gbergman/papers/pv_cP.pdf , for which I recall the definitions (in section 2); you might find that paper interesting to look at. ---------------------------------------------------------------------- Regarding the concept of algebra object in a category (Definition 10.2.4), you ask how we will be using it, saying > ... Even though I read the whole section, I am not grasping the point > of having this concept...I feel like it is going back and forth. Perhaps the following will help. First think of algebra objects, not in terms of the question "How will we be using them?", but as an answer to the question, "How can we take the concept of an algebra, which is defined as an object of the category Set together with certain additional structure, and get a generalization with Set replaced by a more general category C?"
The answer is quite straightforward: In the usual concept of algebra, we have a set |A| with some maps |A| x ... x |A| --> |A|, which we call operations; so for the modified concept, we will assume the category C has finite products, take an object of C which we will call |A|, and define "operations" to be morphisms |A| x ... x |A| --> |A| in C. As for the material going "back and forth": what you have to see is why it does so. After we define the concept of an algebra object in a category C, we find that it leads to a family of algebra objects in the usual sense: Just as every object X of a category C allows us to create a large family of sets, namely the hom-sets C(Y,X) for the different objects Y, so we find that an "algebra A in C" leads to a family of "algebras in Set", i.e., algebras in the traditional sense, namely, the sets C(Y,|A|) with algebra structures induced by the algebra structure of A, via the universal property of direct products. But the algebra object A in C is not these algebras; they can be thought of as its "shadows" in the category Set. However, we can study it using these "shadows"; in particular, we prove that A will satisfy the diagrammatic conditions corresponding to any identities if and only if these "shadows" satisfy those identities themselves. ---------------------------------------------------------------------- Regarding the concept of an algebra object in a general category (section 10.2), you ask whether one can similarly define a "poset object", a "topological space object", etc.. I'll just address the case of a poset object. A difficulty in defining such a structure in a general category is that a binary relation on a set S is a subset of the product S\times S, and there is no canonical choice for what a "subobject" of an object of a category should be.
One can make definitions based on various choices -- taking subobjects to mean domains of monomorphisms, or equalizer objects -- and see whether any of these have nice properties. Or one could see whether one can characterize the "structure" (in an unknown sense) that an object R of a category C needs to have to determine poset structures on all objects C(R,X) in a functorial way. I just tried Googling "partially ordered object" category and the results all seemed to be about toposes or cartesian-closed categories. I haven't studied these, but I know that they are classes of categories that behave much more like Set than most categories do. This suggests that no one has found a good way to generalize the concept of partial ordering to objects of a "typical" category. ---------------------------------------------------------------------- Concerning Definition 10.3.5, you write > ... I am almost sure this representability is not equivalent to > the representability of section 8. It's a generalization thereof. Part (ii) of definition 10.3.5 says in effect "the functor must have the property that, if you forget the operations, it gives a representable functor in the sense of chapter 8". If V = Set, which can indeed occur, since sets are just algebras with no operations, then there are no operations to forget, and part (ii) then says that in this case -- the case to which the definition of chapter 8 applies -- the definition agrees with that of that chapter. (This is like the question "does the definition of multiplication of complex numbers conflict with the definition of multiplication of real numbers?" No, because when restricted to real numbers, it gives the same operation.) ---------------------------------------------------------------------- You ask why the assumption that C is a variety of algebras is needed for condition (iii) of Definition 10.3.5 to be equivalent to the other two.
If C is an arbitrary category with appropriate direct products as in the first sentence of the definition, then "elements of A" and "relations" satisfied by such elements aren't meaningful. One might, of course, ask whether some weaker assumptions can be made which would make those concepts meaningful. One can certainly do that. If the conditions are too weak, the concepts might be meaningful but the equivalence could fail. For instance, if C is the category of all finite groups, one can still speak of elements and relations, but a given system of elements and relations might not determine a finite group, so such X and Y might not determine a representing object. (E.g., one element and no relations give, in the variety of all groups, the infinite cyclic group, but don't determine any object of the category of finite groups.) There are, however, conditions weaker than that C be a variety of algebras, but strong enough to make the equivalence hold. But to develop these would require that we introduce further concepts, and for simplicity, I have restricted the main focus of the text to varieties of algebras. ---------------------------------------------------------------------- Concerning Definition 10.3.5, you write > I am not sure why Rep(C,V) is a full subcategory. To define a subcategory Y of a category X, one must specify its objects and its morphisms. If one begins by specifying the objects, and also says it is to be a "full subcategory", this means that for two objects that belong to Y, the morphisms between them in Y are to be all the morphisms they have between them in X. Since the objects of V^C are functors, when I say in the last sentence of Definition 10.3.5 that Rep(C,V) "consists of" the representable functors, I am specifying the objects of the subcategory. Saying it is a full subcategory says that the morphisms between two such functors in Rep(C,V) are defined to be all the morphisms they have between them in V^C.
(Did you understand the definition of "full subcategory" when you asked this question?) ---------------------------------------------------------------------- Regarding the diagram in the proof of Theorem 10.4.3, you ask whether it is usually easy to obtain a concrete form for G(A). There's no "usually"! It depends on the categories involved. (After all, these left adjoints are examples of universal constructions, and we saw in Chapter 4 that general universal constructions in groups, such as the description of groups presented by given generators and relations, range from very easy to very hard.) ---------------------------------------------------------------------- You ask whether Freyd's result, Theorem 10.4.9, is mostly used with C a variety, as in our initial example of SL(n). Within algebra, certainly. There are other categories of algebras with small colimits; for instance, those defined by "Horn sentences", i.e., implications such as "x^2 = e => x = e" in groups. (If we take the implications "x^n = e => x = e" for all positive integers n, we get the category of torsion-free groups.) But varieties are more often studied. Outside of algebra, I don't know. > Question 1: In practice, how difficult is showing that a given > functor is representable in the sense of Definition 10.3.4? For C a variety of algebras, it's usually pretty clear when the hypotheses apply. Showing that a functor is not representable can be trickier; but since we have just seen that representability as an algebra-valued functor is equivalent to representability of the set-valued functor gotten by passing to underlying sets, Proposition 8.10.4 tells us what we should check. Again, outside algebra, I can't say. ---------------------------------------------------------------------- Concerning the beginning of section 10.6 you write: > ... Can we think of the multiplication in Monoid to be derived > from the comultiplication? 
In other words, can we think of > comultiplication to come before multiplication? It will be easier to talk about the SL(n) example, because there the two varieties, CommRing^1 and Group, are different, so when I talk about "the ring operations" and "the group operations", you will know whether I mean the operations of the domain or the codomain of the representable functor. In the SL(n) construction, the varieties CommRing^1 and Group are defined first -- the definitions of those varieties use only operations, as in all the preceding chapters of the book; no co-operations. Given those two varieties, one considers a functor from the first to the second. This is a construction that takes for input any commutative ring A, and produces for output a group SL(n,A). The operations of the group that we construct are defined using the operations of the ring, and the way that this is done is encoded in the co-operation on the representing object. Our investigation of representable functors Monoid --> Monoid is similar, except that instead of being given a known construction like SL(n), and finding a way of encoding it using co-operations, we are investigating all possible functors Monoid --> Monoid that can be encoded in this way, i.e., that are representable. > ... Even with the example of SL(n), I am not grasping > what is meant by representing multiplication. ... Well, let's go back a few steps. Make sure that you understand the following points: --> If V: C --> Set is a representable functor, with representing object R, then for any object A of C, elements of V(A) correspond to homomorphisms R --> A. --> In the above situation, R has a universal element of V -- an element of V(R) that can be sent to each element of each set V(A) by a unique morphism R --> A. --> In the above situation, R \coprod R likewise has a universal ordered pair of elements of V.
We then see that: --> If there is a binary operation which we can define in a functorial way on the objects V(A) (A\in Ob(C)), then by applying it to the universal ordered pair of elements of V(R\coprod R), we get what can be considered a universal instance of that operation. It will be an element of V(R\coprod R), hence will correspond to a morphism R --> R\coprod R. That is the co-operation. Using the universal property of R\coprod R, it determines the operation on all objects V(A). - - - The question of "which comes first" really depends on the situation one chooses to look at. In the SL(n) situation, we first knew how to multiply such matrices, and then translated this into a co-operation. In the present study of functors Monoid --> Monoid, we don't know, at the outset, what representable functors exist, so we are starting with the properties that a representing monoid and a comultiplication and co-neutral-element must have if they are to determine such a functor. In any case, note that the operations of the representing object R must be defined before we can speak of co-operations, since a co-operation is a map to a coproduct, and the structure of a coproduct of copies of R depends on the algebra structure of R. - - - Unless this clears the problem up completely, I suggest coming to office hours to discuss it. ---------------------------------------------------------------------- Regarding Theorem 10.6.20 you ask, > How do we know that multiple E-systems don't correspond to the same > representable functor from monoids to monoids? That's implied by Exercise 10.6:2. The unit of the adjunction is the map taking each E-system X to PQ(X). If two nonisomorphic E-systems X and X' had Q(X) isomorphic to Q(X'), then PQ(X) would be isomorphic to PQ(X'); but by the exercise, these are isomorphic to X and X' respectively, hence not to each other. Intuitively, the result says that one can recover the structure of X from Q(X), namely by applying the functor P.
(Theorem 10.6.20 is not very clearly stated; I have made a note to rewrite it.)

----------------------------------------------------------------------

Regarding the last line of Theorem 10.6.20, you ask

> ... What is meant by the word "equivalence" ...

See Definition 7.9.5.

> ... and why isn't it italicized?

In an italic passage, de-italicization is used to show emphasis, just as italicization is used in non-italic text. I'm not entirely happy with that convention, since to my eyes, de-italicization doesn't make a word stand out the way italicization does, and doesn't give the same "feeling" of emphasis. But the convention is standard, and I follow it. I am emphasizing the word "equivalence" in this theorem because it conveys the "punch" of the result: that E-systems give all the information one could ask for about representable functors from monoids to monoids: what the distinct structures are, and how they can be mapped to each other.

----------------------------------------------------------------------

> Why is the natural correspondence between isomorphism classes
> as mentioned in thm 10.6.20 contravariant?

It goes back to the Yoneda Lemma, and the point discussed in Remark 8.2.8: Because the hom bifunctor of a category C is covariant in one variable and contravariant in the other, the functor taking each object to the *covariant* hom functor that it induces is *contravariant*. In the present chapter, this comes up in the form: the functor taking a coalgebra object to the covariant algebra-valued functor it represents is contravariant; so for given C and V, the category of co-V-algebras in C is equivalent to the *opposite* of Rep(C,V). This was Corollary 10.3.6. Since the category of E-systems is equivalent to the category of co-monoids in Monoid, it is equivalent to the opposite of Rep(Monoid,Monoid).
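To see the direction-reversal concretely, here is a minimal sketch of my own (not from the text), modeling morphisms as Python functions and Hom(A, X) as the functions A --> X:

```python
# A hedged toy model: a morphism f: A -> B induces a map
# Hom(B, X) -> Hom(A, X) by precomposition, h |-> h o f.
# Note the reversal of direction: f points A -> B, but the induced
# map on hom-sets points the other way.  This is why the functor
# taking each object A to the covariant hom functor Hom(A, -) is
# contravariant.

def induced(f):
    """Given f: A -> B, return the map Hom(B, X) -> Hom(A, X)."""
    return lambda h: (lambda a: h(f(a)))

f = lambda n: 2 * n          # a stand-in morphism A -> B
h = lambda n: n + 1          # a stand-in element of Hom(B, X)
g = induced(f)(h)            # the corresponding element of Hom(A, X)
print(g(3))                  # g(3) = h(f(3)) = 7
```

Of course this models only the set-level behavior, not the algebraic structure; but it shows why composing with a fixed target reverses arrows.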
---------------------------------------------------------------------- Regarding the first two diagrams-with-dots on the page following Theorem 10.6.20, you write > ... I don't see how it is decided that the E-system represented by > the first boxes represents the identity functor and the second boxes > represents the opposite monoid functor. Why can't the first one > represent the opposite monoid functor and the second one the identity > functor? In the second sentence of the paragraph preceding these boxes, note the word "respectively". Make sure you understand what it means, and what consequences it has for the two coalgebras you ask about. Then look at the functors those two coalgebras represent. If you have trouble at some point in this path I've outlined, tell me where, and I'll help you from there. > I have a problem with seeing what difference first coalgebra having > the comultiplication m(x)=x^{rho} x^{lambda} for all x, and the > next one having the comultiplication m(x)=x^{lambda}x^{rho} > for all x, make to the respective functors. ... It's not "for all x"! It's for the one element x\in ||R|| that has degree 2! The other elements x^n have higher degree, and are mapped by the comultiplication to correspondingly more complicated expressions. > ... Do you mind pointing out where I should re-read to understand > this? I guess the best place to start is Definition 10.3.1. The second paragraph of that definition describes the operations on the functor represented by a coalgebra object. It begins by saying that these are induced "under the dual of the construction of the preceding section", but then gives an explicit description of that construction. 
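If it helps, the calculation with these two comultiplications can be modeled in a few lines of code. (This is a toy model of my own, not the text's formalism: the free monoid |R| on one generator x is modeled by powers n >= 0, and the coproduct |R|^\lambda \coprod |R|^\rho by words in the two letters "L" and "R".)

```python
# m1 and m2 are the two co-operations |R| --> |R|^lambda \coprod |R|^rho.
# Since a monoid homomorphism is determined by where it sends x, the image
# of x^n is the n-th power of the image of x -- higher-degree elements go
# to correspondingly longer words, as emphasized above.

def m1(n):
    """x |-> x^lambda x^rho, hence x^n |-> (x^lambda x^rho)^n."""
    return "LR" * n

def m2(n):
    """x |-> x^rho x^lambda, hence x^n |-> (x^rho x^lambda)^n."""
    return "RL" * n

def induced_op(comult, mult, unit):
    """The binary operation induced on underlying sets by a co-operation:
    the pair (a, b) corresponds to the homomorphism from the coproduct
    sending x^lambda to a and x^rho to b; apply it to the image of x."""
    def op(a, b):
        result = unit
        for letter in comult(1):          # comult(1) is the image of x
            result = mult(result, a if letter == "L" else b)
        return result
    return op

# Try it on a noncommutative monoid: strings under concatenation, unit "".
concat = lambda s, t: s + t
op1 = induced_op(m1, concat, "")
op2 = induced_op(m2, concat, "")
print(op1("ab", "cd"))    # "abcd" -- the monoid's own multiplication
print(op2("ab", "cd"))    # "cdab" -- the opposite multiplication
```

Running this gives away part of the exercise, so treat it as a check on your own computation rather than a substitute for it.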
This Definition is written from the point of view of going from the co-operation to the operation, but that should not be a problem: Take the case where |R| is the free monoid on one generator x, note how to identify the set-valued functor represented by |R| with the underlying-set functor on monoids, form the coproduct of two copies of |R|, call the generator of the first "x^\lambda" and the generator of the second "x^\rho", then consider two different co-operations |R| --> |R|^\lambda \coprod |R|^\rho, namely, m_1 taking x to x^\lambda x^\rho, and m_2 taking x to x^\rho x^\lambda. (And note, as I emphasized above, that these do not take every element r of |R| to r^\lambda r^\rho or r^\rho r^\lambda. You should see what they do to other elements.) Then apply Definition 10.3.1 to find the binary operations on the underlying-set functor Monoid -> Set induced by those two co-operations. Hopefully, you will find that one of them is the original operation of the monoids to which the functor is applied, while the other is the opposite multiplication. If you get stuck, come to office hours and go through it with me. (Or if the problem is one that can easily be described in e-mail, you can e-mail it to me.) Once you see these examples, you will, hopefully, understand how, given an operation on a representable set-valued functor, one can go the other way, and find the co-operation on the representing algebra that induces it. Let me know how this goes. If I can identify the roadblocks that keep students from understanding this material, I can hope to get it across better in the future! ---------------------------------------------------------------------- You ask about "a nice description" of the left adjoints of the representable functors Monoid --> Monoid (section 10.6). 
Well, they all have the nice formal description as functors associating to each monoid M a monoid gotten by attaching together a bunch of copies of the representing monoid R, indexed by the elements of M, with relations determined by the relations of M and the comultiplication of R. But what the result looks like for particular comonoids R can be complicated.

For the simplest interesting case: The left adjoint of the functor associating to a monoid its group of invertible elements is the functor associating to a monoid M the result of adjoining an inverse to every element.

The next-simplest interesting case is the left adjoint of the functor associating to M the monoid of pairs (a,b) in M with ab=e. This adjoint adjoins to M, for every a\in M, an element a' such that aa'=1, in such a way that the map a |-> a' reverses the order of multiplication. Note that if in M one has ab=e, then in the new monoid, b will have both the left inverse a and the right inverse b', so it becomes invertible, so its left inverse a also becomes invertible. So the elements that were 1-sided invertible in M all become invertible in the new monoid; hence the submonoid generated by those previously-1-sided-invertible elements is embedded in a group. Elements that were not previously 1-sided invertible need not become invertible, though they do become 1-sided invertible.

I haven't studied those functors systematically ... .

----------------------------------------------------------------------

Regarding the results of section 10.6, you ask,

> ... We just described representable functors from MONOID
> into itself. Does this shed any light on representable functors
> from RING^1 into itself? (I'm asking because rings are monoids
> with extra structure.)
The problem with that approach is that determining representable functors from Monoid to Monoid comes down to saying "These are the only ways one can obtain an associative operation on (appropriate sorts of) tuples of elements of a monoid, using the monoid operation alone"; but when we are looking at functors on Ring^1, we aren't limited to using "the monoid structure alone". Generally, if we are looking at functors out of a given variety W, then restrictions on the functors we can get into one variety V will also give us information about restrictions on functors from W to other varieties V' that in some sense "have a V-structure and more", but won't give restrictions on functors from varieties W' that have "W-structure and more"; inversely, existence results for representable functors V --> W will give existence results on such functors V' --> W when V' has "a V-structure and more", but will not give existence results on functors V --> W' where W' has "a W-structure and more". A description of all representable functors Ring^1 --> Ring^1 is, however, obtained in my book with my first PhD student, Adam Hausknecht, reference [2]. ---------------------------------------------------------------------- You ask about the meaning of "subobject" in the paragraph following display (10.7.1). Good point. I guess I was implicitly relying on the fact that in most real-world mathematical contexts, an idempotent endomorphism of an object is a retraction to a subobject. This isn't a formal statement true or even meaningful in an arbitrary category; so that comment should be considered a heuristic observation, suggesting what we should look for. It is valid with object taken to mean "category", so it leads us to the right conclusion in this case. ---------------------------------------------------------------------- You ask about a generalization of Exercise 10.7:1(v). I hope you would want to include (iii) and (iv) along with (v), since they all show the same pattern. 
In very general form, the pattern is that when one has a retraction of a mathematical object X to an object Y, meaning a map f: X --> Y which has a right inverse g: Y --> X, i.e., such that fg is the identity morphism of Y, though gf may not be the identity of X -- see the paragraph containing (7.7.3) -- then maps from any object Z to Y correspond to maps h: Z --> X such that h = gfh; and likewise maps Y --> Z correspond to maps i: X --> Z such that i = igf. Namely, the map h: Z --> X corresponds to fh: Z --> Y, and the map i: X --> Z corresponds to ig: Y --> Z. You can verify that this gives bijections of sets of maps in each case.

The situations occurring in the exercise are a little more complicated in that the composites FU and UG are not quite the identity functors of the categories in question, but isomorphic to the identity functors. So one has to make a little adjustment (noted parenthetically at the end of 10.7:1(iii), and then taken for granted in the remaining parts).

----------------------------------------------------------------------

You ask whether our classification of representable functors K-Mod --> L-Mod in section 10.8 requires that K and L have unity.

The classification could be carried out either with or without that assumption. (In the latter case, of course, we would leave out (10.8.6) and (10.8.11).) But the category of modules over an object K of the category Ring is equivalent to the category of (unital) modules over the object K^1 of Ring^1, where K^1 is the ring whose underlying additive group is the direct sum of the additive groups of Z and of K, and whose multiplication is defined in a way that uses the multiplication of K on pairs of elements of K, and makes the 1 of Z the multiplicative neutral element. Since rings of the form K^1 are far from all rings (e.g., no field has that form), the categories we get by considering nonunital base rings are more restricted than those we get by studying unital modules over unital base rings.
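To make the multiplication of K^1 explicit: writing its elements as pairs (m, a) with m in Z and a in K, thought of as m*1 + a, the description above forces the formulas sketched below. (The code is my own toy check, not from the text; K is modeled by the even integers 2Z, a ring without 1, so that the Z-action m.b is ordinary integer multiplication.)

```python
# Elements of K^1 are pairs (m, a), m in Z, a in K, read as m*1 + a.

def k1_add(p, q):
    (m, a), (n, b) = p, q
    return (m + n, a + b)

def k1_mul(p, q):
    """(m*1 + a)(n*1 + b) = mn*1 + (m.b + n.a + ab): expand and collect,
    using that 1 is to be the multiplicative neutral element."""
    (m, a), (n, b) = p, q
    return (m * n, m * b + n * a + a * b)

# With K = 2Z (even integers, no unit):
print(k1_mul((1, 0), (0, 6)))    # (0, 6): (1, 0), the 1 of Z, acts as unit
print(k1_mul((0, 2), (0, 4)))    # (0, 8): multiplication within K is preserved
```

Note that every element of the K-summand is a zero-divisor candidate in the Z-direction only through the formula above; in particular K sits in K^1 as an ideal.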
So it seems best to study the unital case, and obtain results for nonunital rings, whenever one needs them, as corollaries gotten by applying the unital results over the rings K^1, L^1. ---------------------------------------------------------------------- You ask whether the concept of tensor product was motivated by the considerations of section 10.9. I think that the tensor product construction originated in physics, where it was realized that the concept of "tension" in a solid -- the expression for the forces acting to stretch and compress the material -- could be expressed as a member of a vector space, and that this space had more dimensions than the 3-dimensional space R^3 in which the material lived, but that it was closely connected to R^3, since every rotation of an object in R^3 induced a corresponding transformation on the expression for tension. It was finally worked out that it was a space generated by the image of a bilinear map R^3 x R^3 --> V, and various spaces with such multilinear maps were called "spaces of tensors"; and eventually a universal space S with a multilinear map V x W --> S was called "the tensor product of V and W". (Originally, for vector spaces V and W; later for more general modules and bimodules.) This is just the impression I've picked up; I've never studied the history of the subject. But I'm sure that the relation with composition of representable functors was realized much much later. ---------------------------------------------------------------------- Regarding the constructions C^pt and C^aug (Definition 10.10.1), you write > For a given category, we need not have a way to turn any object into > a pointed object, as there need not be morphisms from the terminal > object to every object, but all objects can be made augmented, right? Nope. 
First, if we have an object X of C that doesn't admit a morphism from the terminal object, then the corresponding object of C^op won't admit a morphism to the initial object (since a morphism from it to the initial object in C^op is just a morphism from the terminal object of C to it). But we don't have to go to such cases to get examples. In the category CommRing^1, no ring that contains a field can be augmented. (I.e., it can't have a homomorphism to Z.) ---------------------------------------------------------------------- Regarding the statement following Lemma 10.10.2 that when k has nontrivial automorphisms and/or idempotent elements, the automorphism class group of the variety of k-algebras has a more complicated structure, you note that you see how automorphisms can be used, but not idempotents. If k = k_1 x k_2, then every k-algebra R can be written R_1 x R_2, where R_1 is a k_1-algebra and R_2 is a k_2-algebra. Hence we can construct the functor R_1 x R_2 |-> R_1 x R_2^{op}. These constructions alone give a group isomorphic to the additive group of the Boolean ring of idempotents of k. Combining these with automorphisms of k, which in general permute the idempotents, and hence induce automorphisms of the above group, we get a semidirect product of the two groups. ---------------------------------------------------------------------- > Just to check my understanding: the results we state in 10.12 > about contravariant right adjunctions also work for contravariant left > adjunctions, but we're only stating one version because contravariant > left adjunctions are so rare compared to left adjunctions, right? No. If you look at Definition 8.12.1, you will see that in the case of contravariant right adjunctions, if C is a variety and we insert in place of the dash in (8.12.2) the free object F_C(1) on one generator in C, then the left-hand side gives the underlying set of V(~). 
Combining with the right-hand side now shows that the resulting contravariant set-valued functor is representable, namely by U(F_C(1)). Hence the contravariant C-valued functor V is representable by a C-algebra structure on the object U(F_C(1)) of D.

But if you turn to the contravariant left adjunction case, shown by (8.12.3), if C or D is a variety, there is in general no way of rendering either side of that formula as a description of the underlying set of U or V as a representable functor. (If C or D is the opposite of a variety of algebras, one can get such descriptions; but this just translates us to the covariant-adjunction case.)

> Assuming an affirmative answer to the first question: Why are
> contravariant left adjunctions so rare?

I'll give you a reprint of my paper.

----------------------------------------------------------------------

> Are there any examples of functors T: C^\op \to C such that T and
> TT have left and right adjoints but TTT does not?

I don't know. One conceivable way to approach the question would be to have C be a direct product of, say, 4 categories C_0 x C_1 x C_2 x C_3, and have T built out of functors C_i --> C_{i+1} (i=0,1,2), such that some sort of bad behavior is reached only when we carry C_0 into C_3. I don't have anything detailed in mind, but you might be able to go somewhere with the idea.

----------------------------------------------------------------------

You ask about "the intuition for deriving representable functors $V^op --> W$ from $V \bigcirc W$" (section 10.13).

The idea is that given an object $R$ of $V \bigcirc W$, it has both a $V$-structure and a $W$-structure. Because of the former, one can associate to every object $A$ of $V$ the set of homomorphisms $V(A,R)$, and because of the latter, one can apply any n-ary operation of $W$ pointwise to these V-homomorphisms.
Finally, the "commutativity" relations have the consequence that the result of applying a W-operation pointwise to a tuple of V-homomorphisms is again a V-homomorphism, making our set of V-homomorphisms a W-object. As noted in class, duality of vector spaces is an example (with V=W). ----------------------------------------------------------------------
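As a postscript, here is a toy instance of my own (not from the text) of applying an operation of the codomain pointwise to homomorphisms: two additive maps (Z,+) --> (Z,+) can be added pointwise, and the commutativity of the structures guarantees the sum is again additive.

```python
# Two homomorphisms (Z, +) -> (Z, +):
f = lambda n: 2 * n
g = lambda n: 3 * n

def pointwise_sum(h1, h2):
    """Apply the binary operation + of the codomain pointwise to maps."""
    return lambda a: h1(a) + h2(a)

s = pointwise_sum(f, g)
print(s(4))                   # 20, i.e. f(4) + g(4)

# The pointwise sum is again a homomorphism -- the analogue, in this
# trivial setting, of the "commutativity" consequence noted above:
assert s(1 + 2) == s(1) + s(2)
```

Duality of vector spaces works the same way, with scalar multiplication and addition of functionals defined pointwise on linear maps into the base field.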