ANSWERS TO QUESTIONS ASKED BY STUDENTS in Math 245, taught from my notes "An Invitation to General Algebra and Universal Constructions", http://math.berkeley.edu/~gbergman/245, Fall 2011 and Spring 2008. ---------------------------------------------------------------------- You ask whether, in Exercise 1.2:2, p.12, when I speak of groups yielding the same pair, G'_1 = G'_2, I mean isomorphic pairs. No; by equality I mean equality! If you have proved a result about isomorphism, you can submit that as homework -- investigations of questions of one's own devising are accepted as homework, if they are relevant to the course -- but the question asked was about genuine equality. ---------------------------------------------------------------------- Regarding the construction of group-theoretic terms in pp.14-15, you ask whether we need to consider $X$ of uncountable cardinality. That depends on what purpose we will be using our terms for. If we want to use them to write down identities, then the countable case is enough. (We'll show that explicitly, toward the end of the course, for arbitrary sorts of algebras. It is the second paragraph of Lemma 8.4.2, p.289, in the case where \gamma = \aleph_0.) On the other hand, if we have some explicit uncountable group G (for example, the group of all bijections from a countable set S to itself), and we want to reason about all relations satisfied by an uncountable set X of elements of G, then we need to use terms in that uncountable set. ---------------------------------------------------------------------- Regarding my statement on p.17, in the last line of section 1.6, that when one forms the set T of group-theoretic terms in \{x,y,z\}, the term "y" represents "the ternary second-component function", you point out that as a set, \{x, y, z\} doesn't specify that y is in "second position". Good point. I agree that writing X = \{x, y, z\} in no way puts an ordering on x, y, z. 
I guess I was thinking of the symbols x, y and z as meaning "first variable", "second variable" and "third variable", as though they were written x_1, x_2 and x_3. I'll have to think about what to do here in the next revision. Thanks for pointing it out! ---------------------------------------------------------------------- You ask what is meant by a "predicate", where I say on p.23 that if we think of relations as predicates, then the "intersection" operator on them becomes the logical operator "and". A "predicate" means, roughly, an assertion, with some set of blanks to be filled in. In grammar, when one divides the sentence "The man is happy" into "subject and predicate", one calls "the man" the subject, and "is happy" the predicate; and that predicate can be combined with other subjects to give other sentences. So "predicate" is extended by logicians to mathematics, where conditions such as "=" and "isomorphic to" can be called binary predicates, "is positive" (among real numbers) and "is prime" (among natural numbers) can be called unary predicates, etc. A relation can be formalized as a set of pairs, or thought of as a predicate, and the notation varies accordingly. ---------------------------------------------------------------------- You ask for references to how the set-theoretic difficulties with "generalized operations" raised at the top of p.29 can be resolved. Actually, they can be resolved using the ideas of section 6.4 of these notes. I just didn't want to point the reader in section 2.3 to something that he or she might find hard to follow at this point. ---------------------------------------------------------------------- Regarding the construction of the quotient group at the end of section 3.1, p.36, you ask whether there is some other way of constructing that group either "from above" or "from below" ... 
When we find the normal subgroup N generated by a set S of elements of G, we can do this either "from above" or "from below", as discussed in that section. But once we have found it, all we have to do to impose the corresponding relations on G is to divide out by it. So the "from above / from below" distinction comes in at an earlier point in this construction. ---------------------------------------------------------------------- You ask what is meant by "normal form" on p.42. Did you check out the phrase in the index, and re-read the paragraph where it is defined (the boldface number in the index entry)? ---------------------------------------------------------------------- You asked, in connection with exercise 3.4.7, p.45, "Is there any reason not to consider a free solvable group on a set X?" Well, that is essentially what the last sentence of the exercise is asking _you_! So you need to look at the construction of free groups and the other universal objects introduced so far, and also look at the concept of "solvable group", and see whether you can adapt the general methods of constructing free objects to that concept. In trying to do so, you need to look at the differences between the concepts of "group" and of "solvable group", and see whether those differences pose obstructions to one or another of the methods of constructing free objects; if you find that one of those methods goes over with no changes at all, point this out. If all of them run into difficulties, note what those difficulties are, and see whether you can overcome them in at least one case. If you can't, then try to prove a non-existence result. Have you attempted any of this? If so, what were your results? ---------------------------------------------------------------------- You ask about the reason for the term "residually finite", introduced near the top of p.47. Well, in certain contexts, a factor object of a mathematical object is said to consist of "residues". 
I guess this goes back to the ring Z/nZ, where people often think of its set of elements as \{0, 1, ..., n-1\}, i.e., the residues that one gets on dividing arbitrary elements of Z by n. The most common use of the term I am aware of in algebra is when R is a local ring with maximal ideal p; then R/p is called "the residue field" of R. Perhaps in group theory G/N was at some time called the "residue group" of G on dividing by N. Then if G can be thought of as "living in" the direct product of its finite residue groups, it is reasonable to call it "residually finite". Other words can be substituted for "finite". E.g., if the elements of a group can be distinguished by homomorphisms into solvable groups, then it is "residually solvable". I'm so used to this use of "residually" that I hadn't even thought of where it came from. ---------------------------------------------------------------------- You ask why there are different symbols for direct sums and direct products of abelian groups, even though, as noted near the top of p.53, they are the same construction. I think it's mainly historical. The symbol A (+) B developed within group theory and module theory, as representing an abelian group or module in which every element was uniquely representable as a sum a + b with a\in A, b\in B. The symbol A x B developed within set theory, probably based on the fact that when A and B are finite sets, with A having m elements and B having n elements, then A x B is a set with m n elements. The concept "A x B" spread to all areas of mathematics, since the concept of direct product is important everywhere (e.g., if R is the real line, then R x R is the plane). In particular, if A and B are two rings, or (not necessarily commutative) groups, then A x B has a natural structure of ring or group. In afterthought, one sees that when A and B are abelian groups, this construction A x B is isomorphic to the construction A (+) B that people had been using all along. 
So the result was two symbols for the same construction. But as mentioned in the notes, for infinite families, the corresponding constructions are distinct; another reason for using distinct symbols. ---------------------------------------------------------------------- You asked about the Remark following Exercise 3.9:4 on p.58, where I say that if we allow a ring R to have not necessarily commutative additive group structure, but still require multiplication to be linear in each variable, that exercise allows us to deduce that the addition would, nevertheless, be commutative. You asked how this follows. We use the fact that a ring is required to have a multiplicative neutral element, 1. On the one hand, this implies that every element x can be written as a product x . 1, hence the additive subgroup of R generated by all products is all of R; and conclusion (i) of the exercise tells us that that additive subgroup is commutative. On the other hand, part (ii) of the exercise says that a generalized bilinear map F x G --> H factors through the abelianizations of F and G. Now if the additive group of R were not abelian, there would be two distinct elements x, y whose images in the abelianization were the same. But by the statement of part (ii) just quoted, x . 1 depends only on the image of x modulo abelianization; so x . 1 = y . 1, i.e., x = y, contradicting the assumption that x and y are distinct. (So the first of these arguments uses the surjectivity of multiplication by 1, while the second uses the one-one-ness.) If we considered rings R without 1, then strictly speaking, if we did not require commutative addition, we could get examples where the addition was indeed noncommutative. 
But if we think of the interest of a ring structure as lying in the combination of addition and multiplication, then we see that the noncommutativity of such "rings" would depend on an additive subgroup where the multiplication was zero, so one could say that this generalization would be "uninteresting". But that is just a subjective view; the result I was referring to in the book used the assumption that rings have 1. ---------------------------------------------------------------------- You asked, regarding the connection between the Weyl algebra and Quantum Mechanics mentioned on p.72, whether I could recommend any texts giving a mathematical treatment of Quantum Mechanics. I asked a few colleagues, and two books were suggested. One is an old book, "Mathematical Foundations of Quantum Mechanics" by George W. Mackey, now republished by Dover. (Dover republishes old out-of-copyright mathematical works at low prices -- a valuable service!) The other is "Introduction to Quantum Mechanics" by Hannabuss. I also learned that we have a course in the mathematics of quantum mechanics, Math 189. ---------------------------------------------------------------------- You asked about the treatment of Galois theory via tensor products, mentioned on page 74, after Exercise 3.13:4. There are two write-ups of courses Lenstra has taught on the subject; he describes them as "each having its own imperfections": http://websites.math.leidenuniv.nl/algebra/topics.pdf http://websites.math.leidenuniv.nl/algebra/Galoistheoryschemes.pdf The first is notes from a 250A he taught here, written up by him and two students. The second is from a course he gave long ago in Leiden (written up by an unknown person, and found on the web), which did the analog of Galois theory for rings more general than fields. He would be happy to learn of any errata you find in either of them. 
You also mention the site http://en.wikipedia.org/wiki/Grothendieck's_Galois_theory Unfortunately, at the moment Wikipedia seems to be down, so I can't check it out. ---------------------------------------------------------------------- You ask about the statement on p.77 that the idempotents in a commutative ring R correspond to the continuous {0,1}-valued functions on its spectrum. I assume below that you are familiar with basic algebraic geometry: On the one hand, suppose r\in R is idempotent. Then 0 = r - r^2 = r(1-r), so any prime ideal P contains either r or 1-r. Assume P contains r. Then 1-r, being 1 -(member of P), is not in P, so it is invertible mod P, so when one localizes at P one can cancel it from "r(1-r)=0", getting r = 0 in the localization. Likewise, if 1-r\in P, then in the localization, 1-r=0, i.e., r=1. So the continuous function on Spec R induced by r is everywhere {0,1}-valued. Inversely, if f is a continuous {0,1}-valued function on Spec R, then the subsets of Spec R on which it is 0, respectively 1, will be open-closed. Hence in the structure sheaf, the function agreeing with the global section 0\in R on the first of these sets and with 1\in R on the other, i.e., f itself, will be a global section, i.e., a member of R. ---------------------------------------------------------------------- Regarding the construction of coproducts of sets on p.78, you write > Let's say we have S=(S_0,S_1) we are forming the coproduct Q. If a,b > are in S_0, and if (a\cup b)=c, then we would expect the left > injection, as a homomorphism, to satisfy ... "Homomorphism" is a concept relevant to algebraic structures; it means map between underlying sets that refers to the sort of algebra in question. When we talk of a coproduct of sets, these are sets without any additional structure; the analog of a homomorphism is just a map of sets; so it is not expected to satisfy any other conditions. 
I guess you were thinking that a homomorphism of sets should respect "the kinds of things one studies in set theory", which includes things like the operation of taking unions of sets. That kind of structure is what we studied in the section on Boolean algebras. But the concept of a map of sets is simply that of a function, so the coproduct of a family of sets is simply a universal instance of a set with a function from each of those sets into it. ---------------------------------------------------------------------- You ask about the use of the phrase "generators and relations" with respect to sets, on p.78, bottom, asking how one can "generate" anything when one is not given any algebra-operations. We are carrying the phrase "generators and relations" over from the cases of groups, monoids, rings, etc., in order to show the parallelism of the constructions. But in the case of sets, the situation is indeed degenerate, and nothing more is "generated" than the given elements of X. ---------------------------------------------------------------------- Regarding the construction of the set obtained from a set X by imposing relations given by R (p.78, bottom, and 79, top), you write > ... the relation will make certain elements of X equivalent and > then we just need to pick one of them as a representative. ... That is an old-fashioned way of looking at these constructions. It sometimes has advantages; but usually the nicer way is to let the set of equivalence classes itself be one's new set. The one "disadvantage" of that approach is the problem of visualizing it as a "collection of collections". The way I tell my students in Math 113 to think of the result of "dividing" a set by an equivalence relation is that such a set consists of new elements, each of which arises by "gluing together" one or more elements of X. No mathematical difference; but picturing them as "stuck together" rather than "loose" may be less confusing. 
One way or the other, the idea is to have a set having a many-to-one relationship with X: different elements of X correspond to the same element of the new set if and only if they are in the same class under the equivalence relation. ---------------------------------------------------------------------- In connection with the examples at the top of p.80, you ask (explaining that you have not yet taken a topology course) what is meant by "closure". In general, the closure of a subset S of a topological space T means the set of points of T that are either in S, or are limit points of members of S. When the topology comes from a metric (distance function), something that you will have seen in Math 1AB and 53 (even though not under the names "topology" and "metric"), a limit point of S is simply a point which is the limit (in the sense of those courses) of a convergent sequence of points in S. So, for instance, when T is the real line, and S is an open interval (a,b), its closure is the closed interval [a,b]; while for the same T, and S the set of rational numbers, the closure of S is the set of all real numbers. This description of the closure is not valid for topological spaces that aren't metric spaces, as noted at the bottom of p.80, but it at least gives a start at picturing the concept. > ... How does this not hold in the example (3.17.1)? (3.17.1) consists of two examples, side-by-side; let's talk about the first. It represents the image of the real line under a continuous map into a compact rectangular region of the plane. The closure of that image consists of the image itself, the left endpoint of the wiggly line (which is not itself in the image), and, on the right side -- well, you can see that limits of sequences of points in that wiggly line will exist all up and down a vertical interval. 
So there is no way to extend the map R --> K to a continuous map R\cup\{+-\infty\} --> K: the point +\infty would have to simultaneously be mapped to all the points of that vertical interval. The other example is similar. ---------------------------------------------------------------------- Regarding the last full paragraph on p.85, you write, > You note that for a space to possess a universal covering space > it must in general be semi-locally simply connected. ... No; I note that this and the other conditions listed are the assumptions that are shown in [77] to be sufficient to make the construction described work; I don't say that a universal covering space can't exist if they don't hold. The author of [77] may even have known of more general hypotheses under which a universal covering space exists, but decided that it would be best to prove a result that applied to most "naturally arising" spaces, rather than go through a much messier argument to cover some additional pathological cases. In particular, it looks to me as though some spaces that do not satisfy the condition of being locally pathwise connected (which is also in the list), can have universal covering spaces. E.g., consider the subset of the plane consisting of the union of the line-segments y = cx with x\in [0,1], and c taking on the value 0 and all values 1/n for positive integers n. This X is contractible, hence simply connected, so it should be its own universal covering space; but it is not locally pathwise connected (that fails near any point (x,0) with x>0). However, it is indeed hard to see how a space that is not semi-locally simply connected could have a universal covering space. To see why, let us assume for simplicity that this condition fails in the neighborhood of the base-point x_0. Then there are non-contractible loops that stay arbitrarily close to that point. Suppose (p_i)_{i=1,2,...} is a sequence of such loops that stay closer and closer to x_0. 
Then the sequence p_i will approach the trivial loop, which stays at x_0. Hence by continuity of the map p\mapsto \~{p}, their liftings \~{p_i} should be approaching the constant base-point map in the universal covering space. But since they are non-contractible, they should end up in different "layers" of that covering space. If points in those different layers can approach the basepoint, this contradicts the definition of covering space, which forces the inverse image of every point of X to be discrete. However, for non-locally-pathwise-connected X, maybe it would be most natural to modify the definition of covering space, e.g., by replacing "discrete" with "totally disconnected". (Note: I haven't actually looked at algebraic topology since I was a student in the '60's; so the above comments are not based on reliable expertise.) ---------------------------------------------------------------------- In connection with the concept of an isotone, i.e., order-respecting map, defined on p.89, you ask whether there is a term for an order-reversing map. Yes. It's called an "antitone" map. Not too commonly used, but it exists. ---------------------------------------------------------------------- You suggest that the rule that associates the graph called the Hasse diagram to a finite poset (p.92) can be used for arbitrary infinite posets P as well. Unfortunately, the graph you get can lose a lot of information from an infinite P. Think of what you get from the poset of real (or rational) numbers! Can you figure out under what conditions on P that diagram will preserve all the order relations? ---------------------------------------------------------------------- You ask whether we can generalize the preorder "divides", referred to on p.95 as a relation on elements of commutative rings, to noncommutative rings. We can, once we decide what definition to use. 
One says that x "right divides" y if we can write y = ax, and that it "left divides" y if y = xb; each of these is a preorder; they correspond to the inclusion relations on the left ideals Ry, Rx, respectively the right ideals xR, yR, generated by our elements. One could similarly call x an "interior divisor" of y if y can be written axb, and this would also be a preorder, though I have never seen this considered in ring theory. (This relation is not equivalent to inclusion between the 2-sided ideals RxR and RyR, for the reason mentioned on p.71, the two sentences preceding Ex.3.12:2. The relation of inclusion between RxR and RyR would yield still another "divisibility-like" preorder on elements.) ---------------------------------------------------------------------- Regarding transfinite induction (a term not used in the notes), you ask whether this refers to induction, in the sense of Lemma 4.3.4 (p.100) over some ordinal. Right. > Also, I would enjoy seeing an example of some proposition that can be > proven via general or transfinite induction, but not via standard > induction on N. Exercise 4.3:5 proves a standard result on symmetric polynomials by induction over a well-ordered set which is isomorphic to an ordinal > \omega, so it can be thought of as a transfinite induction. There will be more examples in section 4.5, where we will study ordinals. (The definitions of ordinal arithmetic in (4.5.7-9) (p.113) and (4.5.10) (p.114) are by transfinite recursion, and so one proves results about these by transfinite induction; it is also used in proving Lemma 4.5.12(ii).) In the next chapter, it is used in proving Lemma 5.2.1; and still later, in section 8.2. ---------------------------------------------------------------------- > ... You mention, on p.101 in the para. after Exercise 4.3:3, > that some arguments for uniqueness of a differential equation > solution use connectedness. ... The simplest case is the equation y' = 0. 
On the real line, the set of solutions is the set of constants, while on a domain like (0,1) \cup (2,3), the solutions will be constant on each connected component, but can have different values on the two components, giving a 2-dimensional space of solutions instead of a 1-dimensional space. From this, one gets similar behavior for equations y' = f(x) for any continuous f(x); and one gets analogous results for higher order equations; though, depending on the nature of the equations, there may or may not be complications in the existence and uniqueness results other than those resulting from non-connectedness. But differential equations are far from my field; so I'm definitely no source of expert knowledge on the subject, or on how the experts look at it. ---------------------------------------------------------------------- You ask about my statement at the top of p.102 that because of the axiom of regularity, we can make set-theoretic constructions recursively, and whether without that axiom, we have to use methods other than recursion. What I meant was, "by recursion with respect to the membership relation", since regularity shows that this has DCC. Without regularity, one can still do recursion with respect to any index set having DCC, and this is still an important tool in set theory. ---------------------------------------------------------------------- You ask what I mean in the first paragraph of the proof of Prop.4.5.3, p.111, when I say that \alpha is "closed under \in". I mean the first condition of Definition 4.5.2: That for every member \beta of \alpha, all things \in\beta are also members of \alpha. ---------------------------------------------------------------------- You ask what S is in the third paragraph of the proof of Prop.4.5.3, p.111. At the beginning of the paragraph, where I have "a set of ordinals", make that "a set S of ordinals". Thanks for pointing it out! I'll fix it in the next printing. 
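To make the "closed under \in" condition from the first paragraph of that proof concrete, here is a minimal Python sketch. It is not from the notes: the function names are made up for illustration, and it assumes the usual coding of the finite ordinals as 0 = the empty set, n+1 = n \cup \{n\}, modeled by nested frozensets. It checks only the first condition of Definition 4.5.2, not the full definition of an ordinal.

```python
# Illustrative sketch only; names are hypothetical, not from the notes.

def successor(a):
    """Return a ∪ {a}, the successor in the assumed von Neumann coding."""
    return a | frozenset({a})

def von_neumann(n):
    """Build the finite ordinal n as a nested frozenset (0 is the empty set)."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a

def closed_under_membership(a):
    """First condition of Definition 4.5.2: every member of a member of a
    is itself a member of a."""
    return all(x in a for b in a for x in b)

three = von_neumann(3)
# {{∅}} has the member {∅}, whose member ∅ is missing from the set itself.
not_closed = frozenset({von_neumann(1)})

print(closed_under_membership(three))       # condition holds for 3
print(closed_under_membership(not_closed))  # condition fails here
```

The second set shows that the condition is a genuine restriction: \{\{\emptyset\}\} contains \{\emptyset\} but not \emptyset.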
---------------------------------------------------------------------- You ask, in connection with the description of ordinal arithmetic on p.113, whether \omega + \omega can be looked at as consisting of two segments, each having the structure of \omega under the successor operation, but such that the successor operation does not connect the two copies. Yes. In fact, every limit ordinal can be looked at as a disjoint union of (finitely or infinitely many) copies of \omega, each closed under the successor operation. ---------------------------------------------------------------------- Regarding the indexing of cardinals by ordinals, referred to at the bottom of p.116 and the top of p.117, you ask how this is possible, given that "all ordinals" and "all cardinals" do not form sets. The answer to this is similar to the point I mentioned in class, on how we can define "arithmetic operations" on ordinals, though the ordinals don't form a set. In the case you ask about, though we can't regard the indexing as a function from one set to another, we can say that for each ordinal \alpha, the symbol \aleph_\alpha denotes a cardinal uniquely determined by a certain property (namely, that the ordered set of cardinals smaller than it -- a genuine set -- be order-isomorphic to \alpha). So the cardinals \aleph_\alpha are each well-defined, even though the assignment \alpha |-> \aleph_\alpha is not itself a function in our set theory. ---------------------------------------------------------------------- You ask about the significance of singular cardinals (Def.4.5.17, p.117). When dealing with a cardinal \kappa, one can usually say that the union of a family of <\kappa sets, each of cardinality <\kappa, is itself of cardinality <\kappa. (For instance, taking \kappa = \aleph_0, this is the assertion that a finite union of finite sets is finite.) The singular cardinals \kappa are precisely the exceptions. 
Fortunately they are, as I mention, sparse, so if one wants to use that principle in the proof of a theorem, one can throw in the hypothesis "Let \kappa be a regular cardinal" and one hasn't lost much. As an example, if you look at p.2 of my paper with Shelah at http://math.berkeley.edu/~gbergman/papers/Sym_Omega:2.ps , you will see, among the people we thank, Peter Biryukov. We don't say there what we thank him for. It was he who pointed out to us that the assertion we make in the paragraph at the bottom of p.3 was not true as we originally formulated it, without the condition of regularity. ---------------------------------------------------------------------- In connection with Theorem 4.2.6, p.119, you ask whether, since every set is in bijective correspondence with an ordinal, this can be described as a universal property of ordinals. Not in any way that I can see. The word "universal" has various uses in math; for instance, "\forall" and "\exists" are respectively called the "universal" and the "existential" quantifiers; so there are very likely some statements using the word "universal" that follow from the above fact about ordinals. But what are called "universal properties" involve existence of unique maps. ---------------------------------------------------------------------- You ask why I emphasize "pointwise" in the third paragraph of p.127. Hard to remember exactly what was in my mind when I put in that emphasis. I guess the idea was that in a given set S of functions, two functions f and g may have a least upper bound, i.e., a least member h of S that is everywhere \geq f and \geq g, without our being able to say much about this h; and that the reader might carelessly think that this is all that "the maximum of f and g" referred to if they missed the word "pointwise". I tend to be a sloppy reader in such ways, and assume my audience is also likely to be. 
And in general, when a concept that hasn't come up previously in what one is doing is introduced, it is useful to bring it to the reader's attention, and not let it get passed over unnoticed. ---------------------------------------------------------------------- You ask why, as mentioned on p.127, some people write lattice operations using the symbols for addition and multiplication. I'm not sure, but I can make several guesses. I don't know when the symbols \vee and \wedge were introduced; there may have been a time when they were not common, and people simply tried to choose existing symbols with the closest meanings. "+" is natural for "putting things together"; moreover, for unions of subsets of a set, if we think of those subsets as represented by {0,1}-valued functions on the set, the union can be thought of as "addition as integers, with 1 made a ceiling"; while in most natural contexts, meets are intersections, which correspond to products of {0,1}-valued functions. Even after \vee and \wedge had been introduced, some people may have simply stuck with the symbols they had learned first. Also, for a long time we didn't have computers on which to compose mathematics, and typewriters generally didn't have special symbols, but did have +, while "xy" didn't require any symbol. Finally, some people may use "arithmetic" symbols because they feel it valuable to stress the analogy between lattices and rings; one can speak of "ideals" in a lattice, for instance (subsets closed under internal joins and meets with arbitrary elements). Even though these don't play the role of determining the structure of the image of homomorphisms, as they do in rings, they have some uses. 
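The "addition with a ceiling" description of unions and the product description of intersections can be checked on a small example. The following is a minimal Python sketch; the set X, the subsets A and B, and the function names are arbitrary choices made for illustration, not anything from the notes.

```python
# Illustrative sketch only; X, A, B and the function names are assumptions.

X = list(range(6))
A = {0, 1, 2}
B = {2, 3}

def indicator(S):
    """The {0,1}-valued function on X representing the subset S."""
    return [1 if x in S else 0 for x in X]

def join(f, g):
    """Pointwise addition as integers, with 1 made a ceiling."""
    return [min(a + b, 1) for a, b in zip(f, g)]

def meet(f, g):
    """Pointwise product of {0,1}-valued functions."""
    return [a * b for a, b in zip(f, g)]

# The join of the indicator functions represents the union,
# and the meet represents the intersection.
print(join(indicator(A), indicator(B)) == indicator(A | B))
print(meet(indicator(A), indicator(B)) == indicator(A & B))
```

Note that plain addition without the ceiling would give the value 2 at the point 2, which lies in both A and B; that is why the ceiling is needed for joins but nothing similar is needed for meets.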
---------------------------------------------------------------------- You ask, in connection with the symbols "0" and "1" for the least and greatest element in a lattice having these, introduced on p.132, whether this is related to the fact that in a ring, 0 and 1 generate the smallest and largest ideals. The use of "0" and "1" is certainly related to the ring-theoretic and ideal-theoretic analogies; in particular, the case of Boolean rings; and that case is in turn related to the fact that in the set 2^X of subsets of a set X, identified with their characteristic functions, the empty set and the whole set correspond to the constant functions 0 and 1. ---------------------------------------------------------------------- You ask why the fact noted on p.136 that infinite meets and joins in a complete lattice are not operations of a fixed arity is a problem for complete lattices and not for <\alpha-complete lattices in general. For <\alpha-complete lattices, one can regard it as a problem, but a problem with an easy solution, noted in the middle of the paragraph in question: Regard such objects as having a set of "meet" and "join" operations, one for each arity <\alpha. The difference in the case of unrestricted complete lattices is that the resulting system of operations will not form a set, since there is not a set of all cardinals. Given a particular such complete lattice L, one can argue that one doesn't "need" operations of arities bigger than card(|L|). But in many situations one isn't _given_ L, one wants to construct an L, or say whether there is an L with a particular property, so one can't know in advance a certain set of operations that will be sufficient. And this is exactly what goes wrong in the matter that paragraph refers us to, Exercise 7.10:6(iii). You also suggest regarding meet and join as unary operations on P(|L|). 
Well, the theory of sets with two unary operations would then be applicable to that structure; but that isn't the same as the theory of L as an algebra. ---------------------------------------------------------------------- You ask why, on the top lines of p.137, I attach importance to \omega^X being a full direct product. If we try to extend the result of the preceding exercise to non-complete lattices L, we find that we cannot in general map any lattice of the form P(X), i.e., 2^X, onto it by a complete upper semilattice homomorphism. But we could if we allowed ourselves to use for our domains complete upper subsemilattices of P(X). (Just embed L in a complete lattice L', find a map f of some P(X) onto L' as in the preceding exercise, and then note that f^{-1}(L) is a complete upper subsemilattice of P(X) which f maps onto L.) So getting a surjective map on a general subsemilattice of a direct product is easier than getting such a map on a full direct product; and my comment notes that we are not taking that easy way out, here. ---------------------------------------------------------------------- You ask whether there is a characterization of "cocompact" elements (p.138, sentence after Exercise 5.2:16) in the lattice of subgroups of a group. Well, one context where such a concept comes up is in module theory. A nonzero module is called simple if it has no proper nonzero submodule, and the submodule of a module M generated by all its simple submodules is called the "socle" of M. One can show that the zero submodule of M is cocompact in the lattice of all submodules of M if and only if every submodule of M contains a simple submodule, and the socle of M is finitely generated. In particular, looking at Z-modules, i.e., abelian groups, it follows that the zero subgroup of an abelian group A is cocompact in the lattice of all subgroups if and only if every element of A has finite order, and A has only finitely many elements of prime order.
I believe the above fact on modules over a general ring is somewhat useful in module theory. But I don't know a criterion for a general submodule N of a module M to be cocompact in the submodule lattice. One cannot say that this will hold if the zero submodule of M/N is cocompact in the lattice of submodules of M/N. For instance, let A be the abelian group of exponent p which is free as a Z/pZ-module on a basis x_0, x_1, ..., x_n, ... and B the submodule of A consisting of the elements in which the sum of the coefficients of the above basis elements is 0. Then A/B =~ Z/pZ, so the zero element in the submodule lattice of A/B is certainly cocompact. But B is not cocompact in the submodule lattice of A. To see this, look at the submodules A = A_0 > A_1 > A_2 > ... where A_i is generated by all basis elements x_j with j\geq i. We see that none of the A_i contains B, but their intersection does, proving noncocompactness of B. I haven't thought about the corresponding questions for nonabelian groups. ---------------------------------------------------------------------- You ask how the exchange axiom, (5.4.1) on p.148, is used in showing that bases of a vector space all have the same number of elements. More precisely, that axiom is used when we know that one of the bases is finite. (An entirely different method is used when both are infinite, which calls on the fact that all vector-space operations are finitary.) The idea is as follows: Suppose B_1 and B_2 are bases of V, with B_1 finite. We do induction on the number of elements belonging to B_1 but not to B_2. If that number is 0, we are done, since one basis can't be properly contained in another. If not, let z\in B_1 not be a member of B_2, and let X = B_1 - \{z\}. Since X does not span V, its span (closure) must miss some y\in B_2. 
Applying the exchange condition to this X, y and z, one can deduce that (B_1 - \{z\}) \cup \{y\} is again a basis of V; and it has the same number of elements as B_1, but more elements in common with B_2 than B_1 did, allowing us to complete the induction. Check out the linear algebra text where you first saw the uniqueness of dimensions proved, and see whether the argument there is "essentially" the above. ---------------------------------------------------------------------- You ask about the relationship between the concept of Galois connection in my notes (p.148), and the one at http://en.wikipedia.org/wiki/Galois_connection . Note that in Exercise 5.5:2, I give an equivalent description of a Galois connection, and in the second half of that exercise, I generalize it to partially ordered sets. This corresponds to the "Alternative Definition" in the Wikipedia article, which they call an "antitone Galois connection". Their definition of a "monotone Galois connection" is simply a Galois connection in that sense between a poset A and the opposite of a poset B. ---------------------------------------------------------------------- Regarding Lemma 5.5.1 on p.148, you write > The first couple conditions given in this lemma look like theorems > of intuitionist logic regarding negation: > > A -> B <=> ~B -> ~A > A => ~ ~A > ~A <=> ~~~A > > but without the classic > ~~A => A > > which would be analogous to the lemma condition (ii) being > equality rather than inclusion. Is there a connection here? I don't know. I haven't studied intuitionist logic; but it sounds interesting. Maybe one has a Galois connection on propositions, given by "is incompatible with under intuitionist logic" ... ? ---------------------------------------------------------------------- You ask what I mean by a "linear functional" in Example 5.5.7, p.150. As you suggest, it means a linear map from the vector space to the base field.
---------------------------------------------------------------------- You ask about generalizations of the duality on convex sets that I describe on p.150, Example 5.5.7; in particular, in the case of polyhedra in R^3, mentioned in class. It looks to me as though it should be possible to generalize the duality to nonconvex polyhedra X whose faces don't contain the origin, 0. Namely, write the plane of each face F_i of X in the form f_i(x) = 1 where f_i is a linear functional, and take the vertices of the dual to be these points f_i. Let two vertices f_i, f_j be connected by an edge in the dual if the faces F_i, F_j meet at an edge in X, and let the dual have a face with vertices f_{i_1},...,f_{i_k} if the faces F_{i_1},...,F_{i_k} meet in a vertex in X. This duality wouldn't correspond in any way I can see to a Galois connection, but it looks fairly easy to work with. However, there would be complications: a non-convex polyhedron can have more than one face lying in the same plane, and this would lead to the dual polyhedron having vertices that have to be "counted more than once". So one would have to set up a theory of polyhedra with vertices possibly counted more than once, and if one thinks about it, the same phenomenon for edges, and probably faces. One could also approach the construction more abstractly, using a formal description of a polyhedron in terms of abstract vertices, edges and faces, with an incidence relation. I don't know just what properties the incidence relation should be assumed to have, but the properties would probably be self-dual, and so allow dualization. You write that you tried something like that out for a torus, and it seemed to be self-dual. This is probably because the Euler characteristic, V - E + F, is unchanged under interchanging V and F.
But this wouldn't work in higher dimensions; first, because in that case the Euler characteristic doesn't completely determine the structure of the manifold, and second, because in even characteristic, dualization would change the sign of Euler's formula. ---------------------------------------------------------------------- Regarding Example 5.5.8, p.150, you ask about my assertion that when X is a ring of abelian-group endomorphisms of M, so that M is an X-module, then X^* is the ring of X-module endomorphisms of M. Well, have you written out, on the one hand, the condition for t\in T to belong to X^*, and on the other hand, the condition for t\in T to be an X-module homomorphism, and compared them? If you did, but don't see why the resulting properties should be equivalent, then send me the conditions you have written down, and I'll say more. ---------------------------------------------------------------------- You ask about the assertion on p.152 that the set of propositions implied by s \vee t is the intersection of the set of propositions implied by s and the set of propositions implied by t. Well, let me know how far you were able to get. There are two parts to such a statement of equality: that any proposition implied by s \vee t must be in that intersection, and that any proposition in that intersection must be implied by s \vee t. Can you prove either one of these statements? (If you have trouble with one of the directions, you might ask yourself "What might examples of propositions s, t and another proposition p for which the desired implication doesn't hold look like?") When you've gotten as far as you can with this, let me know what you see and what you don't see, and I'll help. ---------------------------------------------------------------------- Regarding Galois connections (pp.148-153) you ask under what conditions the closed sets of one of the resulting closure operators will be the closed sets of a topology. 
Since the class of closed sets under a closure operator is automatically closed under arbitrary intersections, the conditions that have to be satisfied are that the empty set be closed, and that the union of two closed sets be closed. (The latter is condition (b) of Exercise 5.3.15.) There are nice conditions one can assume on a relation R \subseteq S x T that will imply these properties. To make the empty set closed, one can assume that there is an element of T that relates to no elements of S. To make unions closed, one can assume that for any two elements t_1, t_2 \in T, there is an element t_1\vee t_2 \in T, such that the elements it relates to under R are precisely those to which either t_1 or t_2 relates. (Cf. first display on p.152.) This may seem unnatural -- it does not hold in "typical" Galois connections -- but neither do "typical" Galois connections have the property that unions of closed subsets are closed. An example where it does hold, other than languages with an operator \vee and their models, is Example 5.5.6 (points of complex n-space, and the polynomials that are zero at those points). For any two polynomials t_1 and t_2, their product can be used as "t_1\vee t_2"; and the Zariski topology on complex n-space arises from this Galois connection. The sufficient conditions described above are not necessary. For instance, to get the empty set to be closed, it suffices that for _each_ element of S there be an element of T which does not relate to it; and one can similarly weaken the condition that leads to finite unions. However, my guess is that the properties I've described will tend to give the most natural cases where the closed sets under the Galois connection form the closed sets of a topology. ---------------------------------------------------------------------- You write that you have heard that some mathematicians remove the existence of identity morphisms (p.158) from the definition of category. I don't recall hearing that.
Can you point me to an example? You say that this is as reasonable as treating semigroups and monoids along with groups. I'll agree that it is as reasonable as considering semigroups -- but not that it is as reasonable as treating monoids! The definition of "monoid" embodies the natural structure on the set of endomorphisms of a mathematical object. There are indeed cases where the definition of a semigroup describes a structure that one wants to deal with; but these come up only in more complicated situations: when one wants to look at endomorphisms of a mathematical object that satisfy some restriction which respects multiplication, but isn't satisfied by the identity map. E.g., "all maps of the infinite set X into itself that have finite image", "all non-one-to-one maps of X into itself", etc.. Generally, these are not "stand-alone" examples; rather, they occur as subsemigroups of monoids (in the above two cases, the monoid of all maps of X into itself). Why don't I similarly introduce "nonunital categories"? There are endless tangents one can go off on, and one has to limit what one covers, both for reasons of time, and to give a unified subject matter that the student can absorb. If it seems from the notes that I am inclined to go in all directions, this is illusory -- I give very varied examples of the major concepts (such as "universal constructions") so as to provide a full perspective for understanding them. But in the basic concepts that I am presenting, I try to stick to the important ones, and not throw in less important variants. After learning about categories in the standard sense, the student who has reason to study structures that are essentially subsystems of categories closed under composition but not containing all relevant identity morphisms can easily do so.
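To make the "subsemigroup of a monoid" picture above concrete, here is a minimal sketch. It uses a 3-element set as a stand-in for X (an assumption made so that everything can be enumerated; the examples in the text involve infinite X), and the names all_maps, compose and non_injective are hypothetical. It checks that the non-one-to-one maps of X into itself are closed under composition but do not include the identity map.

```python
from itertools import product

# A small finite stand-in for X; a map f: X -> X is stored as (f(0), f(1), f(2)).
X = range(3)
all_maps = set(product(X, repeat=len(X)))   # the monoid of all maps X -> X

def compose(f, g):
    """Return the composite 'f after g', i.e. x -> f(g(x))."""
    return tuple(f[g[x]] for x in X)

identity = tuple(X)

# The maps that are not one-to-one.
non_injective = {f for f in all_maps if len(set(f)) < len(X)}

# Closed under composition: a subsemigroup of the monoid of all maps ...
closed_under_composition = all(
    compose(f, g) in non_injective for f in non_injective for g in non_injective
)
# ... but the identity map is not one of its elements.
contains_identity = identity in non_injective

print(len(all_maps), len(non_injective), closed_under_composition, contains_identity)
```

On a finite set the "finite image" example from the text is of course vacuous, which is why only the non-one-to-one case is illustrated here.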
---------------------------------------------------------------------- Regarding constructions like G_{cat} for G a group (p.161), and P_{cat} for P a partially ordered set (p.163), you write > ... the emphasis on relating the structure of certain categories > to the structure of mathematical objects such as monoids, groups, > partial orders, etc., is much greater than in previous introductory > texts which I have read ... > > Is this primarily intended as a way of providing many "concrete" > examples ... or ... will it become a useful mathematical tool ... I would say that my primary motivation was neither of these: it was to show that categories are "the same sort of things" as groups, partially ordered sets, etc.. To which I will add that it is equally important to see categories as being "of a different sort", in that they can represent in one entity the vast array of all structures in a field of mathematics. But the way categories are used makes the latter viewpoint clear, while the viewpoint of them as mathematical structures like groups, monoids, etc. often gets overlooked. It is worth having both complementary understandings. Secondarily -- yes, these constructions give a nice class of examples of categories; examples different from the sort that one usually sees. And finally, there are some uses for such examples. For instance, we will see that if G is a group, then an "action" of G on an object of a category C is equivalent to a functor G_cat -> C; so constructions like "the fixed-point subobject of the action of G on X" will be expressible as an instance of the "limit" of a functor. ---------------------------------------------------------------------- You ask about the omission of arrows representing composite morphisms in diagrams of categories; e.g., in the first display on p.163. In general, diagrams that we draw show morphisms that are going to be discussed, and that are not merely composites of other morphisms shown. 
If we tried to draw all the morphisms in a category, the result would usually be far too complicated and confusing to the eye. Our pictures simply show the key things we need to focus attention on. We don't _always_ omit all morphisms that are composites of others that we show. E.g., in the diagrams on p.37, we showed the diagonal arrows even though they are (after the fact) composites of the horizontal and vertical arrows. But this is because conceptually they were given before the vertical arrows, and the properties characterizing the vertical arrows required the diagonal arrows to state them. So we omit composite arrows when we can. My showing the diagonal arrow in the diagram on p.162 was exceptional -- based on the very introductory nature of this section. ---------------------------------------------------------------------- You ask about naturally occurring examples of composition of relations (p.164, top) other than the case of functions. Well, a lot of the things that are loosely called functions but aren't really can be thought of as relations. In calculus texts one sees "the function 1/x from real numbers to real numbers", but it is not a function because it isn't everywhere defined; and in some contexts one talks about "multivalued functions", such as "+- sqrt x". The obvious way of "composing" these corresponds precisely to composition as relations. Phrases like "is a friend of a client of --" can be thought of as the composite of the relations "is a friend of" and "is a client of". But mostly, I would say that if and when one has a question about composition of relations, one can use the definition itself, and gain experience with the concept by applying it in trying to answer one's question. It isn't a major topic of this course, so there will be few such questions here.
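For anyone who wants to experiment with the examples just mentioned, here is a minimal sketch. Relations are stored as sets of pairs; the function compose_relations, the order in which it takes its arguments, and the sample names are all assumptions made for illustration, and may not match the conventions on p.164.

```python
from fractions import Fraction

def compose_relations(R, S):
    """Pairs (x, z) such that (x, y) is in R and (y, z) is in S for some y."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

# "is a friend of a client of --", with hypothetical people.
friend_of = {("Ann", "Bob")}    # Ann is a friend of Bob
client_of = {("Bob", "Cy")}     # Bob is a client of Cy
print(compose_relations(friend_of, client_of))

# A "multivalued function" followed by a partial function, on a few sample values.
pm_sqrt = {(x, y) for x in (0, 1, 4) for y in range(-2, 3) if y * y == x}
reciprocal = {(y, Fraction(1, y)) for y in (-2, -1, 1, 2)}   # undefined at 0

# 0 has the square root 0, which has no reciprocal, so 0 is related to nothing.
composite = compose_relations(pm_sqrt, reciprocal)
print(sorted(composite))
```

The point of the second example is that the composite is again neither everywhere defined nor single-valued, yet the definition of composition handles it with no special cases.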
(I have a preprint which considers, among a number of other structures, the monoid of self-relations on a set, under the composition operation, about which there are some open questions; you can look at it at http://math.berkeley.edu/~gbergman/papers/embed.pdf .) ---------------------------------------------------------------------- You ask about the term "partial operation" used on p.164, in the sentence below the final display. A "partial function X --> Y" means a function from a subset of X to Y. E.g., in Math 1A, when one speaks of "the function 1/x" or "the function sqrt x", these are partial functions from the real line to the real line. A partial binary operation on a set X is a partial function X x X --> X. In particular, if X is the set of all germs of analytic functions at points of the complex plane, then one has a partial operation of composition, since one can sometimes compose the germ of a function f at a point z_1 with the germ of a function g at a point z_0, namely, if and only if g(z_0) = z_1. As stated, these are exactly the cases needed to make "GermAnal" a category. ---------------------------------------------------------------------- I'll somewhat arbitrarily put your question about "empty composites" with material related to p.166, though you actually asked it much later. > ... Is there a nice way to define the composite of a collection > (or sequence?) of functions such that the composite of the empty > collection is the identity map? ... Well, see what you think of this definition. Suppose we are given n+1 objects X_0,...,X_n in a category C, and for 0\leq i < n, a morphism f_i: X_i -> X_{i+1}. We want to give the simplest possible definition of their composite, a function X_0 --> X_n. So we will recursively define \prod_{i=m-1} ^{0} f_i: X_0 --> X_m. The recursive step will obviously be \prod_{i=m} ^{0} f_i = f_m (\prod_{i=m-1} ^{0} f_i). What should we take as the base step? 
The naive answer would be to make the base step the definition of the product with m=0 by (\prod_{i=0} ^{0} f_i) = f_0. But I would say that a more elegant solution is to define the empty subproduct of this chain of morphisms, \prod_{i=-1} ^{0} f_i, as id_{X_0}: X_0 -> X_0. That way, "f_0" gets introduced at the m=0 recursive step just as each other f_i gets introduced at the m=i step. How does that look? ---------------------------------------------------------------------- You ask how one can allow a member of Ar(C) to belong to more than one hom-set (p.167), given that they are drawn as arrows with definite source and target. The fact that we draw them that way isn't part of the definition of a category! It is simply a convenient way that we picture morphisms. So it is our right to draw diagrams that way that one might question, not whether one can allow morphism-sets to overlap. To the question "How can we justify drawing diagrams with each arrow having a source and target, when a given element may lie in more than one morphism set?", I think the right answer is that the arrow f we draw from X to Y represents f "regarded as a member of C(X,Y)"; and if we want to formalize that concept, we can do it by saying that the arrow really represents the 3-tuple (X,Y,f). This is not really different in nature from such questions as how we can justify writing the composite of elements f and g of a group G as fg, given that the underlying set |G| admits many group operations, and the product will be different in one than in another. The answer to that one is that fg is our shorthand for \mu_G(f,g), and it is safe to use such shorthand when we are not explicitly dealing with more than one group-structure on the same set (or on groups with overlapping underlying sets). > ... Also, since the text will not require that hom-sets be > disjoint, what advantages will this give? ...
So far as I am concerned, only the advantage of not alienating people who are used to the definition saying that a function f: X --> Y is a subset of X x Y. To such people, the ordinary systems of sets and maps that they are used to would not form categories if we used the more restrictive definitions. Too many categorists don't care about that -- they take the attitude "We know the right way to do things", don't try to make them intelligible to the general mathematical community, and wonder why category theory is underappreciated! But once one is doing category theory, the things one is interested in from one point of view can always be translated into a variant language; so a student who has read this text should have no trouble adjusting to the axiomatics of a text that assumes hom-sets disjoint. ---------------------------------------------------------------------- In connection with the discussion on p.168 of attitudes about categories, you ask whether category theory "has any content", or is, as Wittgenstein said of logic, merely "a tautology". Well, it has been said that all of mathematics consists of tautologies. Insofar as that is true, it is true of category theory in particular. A tautology is a statement that is automatically true; and it is usually thought of as therefore being a statement that is obviously true (such as "X = X"). But a statement can be automatically true without being obvious; and I think the nontrivial results of mathematics fall into this category. So being tautologies does not keep them from being powerful and useful additions to human knowledge. ---------------------------------------------------------------------- You ask whether category theory is essentially a language in which to say things about existing fields of mathematics, or is a field with nontrivial content (p.168). I say, ask yourself that question at the end of this course! 
---------------------------------------------------------------------- Regarding the development of the Axiom of Universes on pp.169-170, under which a universe is a set, and every set is a member of a universe, you ask, "Don't these imply that every collection of sets is a set which is a contradiction?" No, nothing in the axioms implies that every collection of sets is a set. Under the axioms, for each set X there is a universe which contains X; but that universe will vary from one set X to another. So there is no assertion that one universe contains all sets. ---------------------------------------------------------------------- Regarding the discussion on p.170, you write, > ... you redefine "set" to "small" and "large" set. Then, later, you > mention the lack of a "set" in ZFC that satisfies being a universe, > though the class of all sets would be. I am understanding this to be > in the old sense? The term "set" will always refer to a conventional > set, or will set be used to encompass large and small set? Actually, the third paragraph of section 6.4 was just meant to lead the reader to the ideas developed in what followed; so when I said "So let us change their names ...", I really meant "So suppose we changed their names ...". As of the next paragraph, we begin formally setting up what we really do. What we talked of loosely as "old sets" and "new sets" are now both sets within the set theory that we are discussing. We no longer need to consider large sets "things we used to call classes"; though we can still say that the members of U form a self-contained set theory, from within which the things not in U look like "classes that aren't sets". I hope this helps. Let me know if you still have difficulty with this. ---------------------------------------------------------------------- Regarding the idea of fixing some universe U, and considering those objects of a given sort (including categories) that lie in U, as described on p.170, you write: > ...
I'm having trouble understanding why we should expect any > set of categories to be in this "standard" universe U. ... If U contains, say, some set X of groups, then it will also contain the category whose objects are the members of X, and whose morphisms are the group-homomorphisms among these. And if U contains a set Y of sets of groups, it will contain the set of categories whose members are the categories constructed as described above from the members of Y. Let me know whether you have any difficulty with proving these statements, and/or if you have difficulty seeing that any universe U will contain sets of groups, and sets of sets of groups. Intuitively, ZFC was set up to handle "everything that mathematicians ordinarily do", and it does this quite well. Forming objects constructed as ordered tuples, sets of mathematical objects, sets of homomorphisms among them, etc., are among these things; so your understanding of ZFC should include, at least in sketch form, an understanding of how these things are done. Since categories are defined as tuples with certain properties, ZFC can handle these equally well; and since every universe satisfies ZFC internally, these properties will hold in any universe. The one Achilles' heel of ZFC is the impossibility of defining "the set of all --", where "--" is not restricted in terms of some given set. The Axiom of Universes partly overcomes this: it allows us to speak of "the (large) set of all (small) --"; and that's what we must do when we define things like "the category of all groups". But if you merely want to get some, and indeed, lots of categories within U, that's easy: Just start with some set of groups etc. within U and as described in the preceding paragraph, form the corresponding category. There are other sorts of categories that arise in ways different from "mathematical objects of a given sort and morphisms among them"; see pp.161, second paragraph, through p.163, middle. 
These are easy to apply in any universe as well. ---------------------------------------------------------------------- You ask what I mean by a "large lattice" in the middle of p.171. Hmm -- I guess I should make clear in Definition 6.4.4 that the last sentence, "Large will mean not necessarily small or legitimate", applies not just to categories, but to mathematical objects generally. ---------------------------------------------------------------------- Regarding the comment on p.177, just before Definition 6.5.4, that "faithful" and "full" aren't the only analogs of "one-to-one" and "onto" that can be considered for functors, you ask what some of the others are. There are none that come up often enough that I knew their names; but looking online, I see that a functor F: C --> D is called "representative" or "dense" if for every object Y of D there is an object X of C such that F(X) is isomorphic in D to Y; a kind of "onto-ness" property. One could consider the "one-one-ness" property of taking non-isomorphic objects to non-isomorphic objects, but I haven't found anyplace where that is given a name. There are lots of adjectives used to describe kinds of functors; but most of the properties in question are of different sorts from one-one-ness and onto-ness. ---------------------------------------------------------------------- Regarding Definition 6.5.7, p.179, you ask > ... is it meaningful to speak of a hom-functor for a non-legitimate > category? In this case, the C(X,Y) are not sets. Yes they are!! I think you were in class Monday, when I emphasized that "large sets" are still sets within our version of ZFC -- they just aren't members of whatever universe U (itself a set) we happen to be focussing on; but they do belong to some larger universe U' by the Axiom of Universes. As noted in the short middle paragraph of p.172, our version of set theory still has the property that "all sets" don't form a set. But "large sets" that we talk about definitely are sets. 
---------------------------------------------------------------------- You ask whether the Galois correspondence (I guess you mean the correspondence between subgroups of a Galois group and intermediate fields) is a functor. Well, it can certainly be made a functor by regarding the subgroups of the Galois group as forming a category with inclusions as morphisms, and similarly for the intermediate fields: they give anti-isomorphic partially ordered sets P and Q, which translates to a contravariant functor (p.180) P_cat^op --> Q_cat. There is a more sophisticated category-theoretic approach to Galois theory which you might find more interesting. I don't know the details, but Lenstra used it in teaching Math 250A one year. A lot of the students found it very difficult, and since I taught 250B the following semester, I ended up having to re-teach them Galois theory the traditional way. But Lenstra's notes from his 250A are online at http://websites.math.leidenuniv.nl/algebra/topics.pdf , and you might want to look at them. I think the Galois theory itself begins around the bottom of p.114, though it depends on lots of ring- and category-theoretic preparation in the preceding sections. ---------------------------------------------------------------------- You note that for the definition of a product category C = \prod C_i on p.184, the Axiom of Choice guarantees that Ob(C), as defined, will have at least one element, but you ask whether it need have any others, and how we can be sure that it is "what we want it to be". Well, by its definition, a product set is always "what we want it to be"; the function of the Axiom of Choice is to guarantee that it will have the properties we expect it to. That axiom tells us here that if the categories C_i all have nonempty object-sets, then so will C. If they each have just one object, then of course Ob(C) = \prod Ob(C_i) will also just have one. 
You should be able to prove from ZFC that if a family of nonempty sets does not consist wholly of 1-element sets, then their product has more than one element; as well as such statements as that if the family is infinite, and each member has more than one element, then the product set has at least 2^{\aleph_0} elements. ---------------------------------------------------------------------- Concerning the statement on p.191 that the empty set is the initial object of Set, you ask "What is the morphism in C(\emptyset, X)?" We touched on this in reading #1: See p.13, last sentence of next-to-last paragraph, with the key words, "there is exactly one". I didn't go into details there, because I felt that the student who thought this through would see it. You should look at the definition of a function from a set X to a set Y, and ask yourself what satisfies that definition when X is the empty set. If you have trouble thinking this through, ask again. ---------------------------------------------------------------------- Regarding the concept of a free object with respect to a concretization of a category (Definition 6.8.3, p.192), you ask > Is there a generalization of free object for non-concrete categories? Well, a free group (etc.) is a group F(X) with a universal X-tuple of members of its _underlying_set_; so having an "underlying set" is part of the essence of the concept, and the category-theoretic abstraction of the underlying set is a concretization. But one can get various sorts of generalizations depending on how far afield one is willing to go, and still consider the result a version of the "free object" concept. For a small generalization, one can drop the faithfulness condition in the definition of concretization.
E.g., the functor taking every group to the set of its elements of exponent 2 is not faithful, but there is an analog of the free group for that functor, namely the functor taking every X to the group presented by an X-tuple of generators, together with relations saying that all those generators (but not necessarily all other elements) satisfy x^2 = e. Much more loosely, one could call the result of any universal construction (or at least, any left-universal construction) a "free" object for the relevant conditions. And some authors do. In between these, one can consider the construction of the left adjoint of a functor, which we will see defined in section 7.3 to be (when it exists) a generalization of a free object construction. ---------------------------------------------------------------------- Regarding the concept of kernel on p.196 (2nd paragraph before Definition 6.8.7), you ask > How closely does the categorical definition of kernel match our usual > meaning? I'm not aware of any categories with zero objects in which > the categorical definition differs from the standard definition, ... My first reaction was that one would have to come up with a pretty exotic category, and there would not be likely to be a "standard definition" of kernel there! However, your suggestion > ... but perhaps one could be concocted by taking some subcategory > of Ab in which not every standard kernel is an object ... works: if C is the category of divisible groups (Exercise 6.7:5, p. 186) then the map Q --> Q/Z has zero kernel in C under our definition, but one would ordinarily say that it has kernel Z, which is not in C. ---------------------------------------------------------------------- In connection with the concepts of pushouts and pullbacks (pp.196-197) you mention having seen pullbacks in algebraic geometry, and ask whether pushouts of schemes are equally important. Well, if you look at affine schemes, pushouts correspond to pullbacks of rings.
In particular, given two subrings of a ring, the pullback of the diagram formed from that ring and those two subrings is the intersection of those subrings. But intersection does not respect the properties that algebraic geometers like, such as being Noetherian. (Can you find an example of a finitely generated Noetherian ring and two finitely generated subrings whose intersection is not Noetherian?) Intuitively, forming a pushout of schemes corresponds to gluing two schemes together in a manner prescribed by maps from a third; and this gluing process can create singularities. ---------------------------------------------------------------------- You ask what I mean in the first sentence of Lemma 6.9.3, p.203, by unordered pair of objects. "Ordered pair" is the standard term for the sort of entity that we write (x,y). If I said that there was no more than one morphism between an ordered pair (X,Y) of objects, this might be taken to mean that C(X,Y) had at most one element; but I want to say more: that C(X,Y) and C(Y,X) each have at most one element, and that they can't both have an element unless X=Y. So I use the phrase "unordered pair of objects" to mean "two objects, with no difference in the roles we assign them." I used the same phrase in Exercise 6.2:1, p.162, where I made a precise statement, then used this phrase as an informal translation. ---------------------------------------------------------------------- You ask how one can prove the statement on p.204 that "there is no natural way to make a contravariant functor out of P_f." As I use it, the word "natural" is an informal term, like "obvious" or "reasonable", so it doesn't require a proof. The sentence simply means that none of the ways that we discovered, when we discussed the power-set construction, to turn maps among sets into maps among their power sets, gives a contravariant functor that takes finite subsets to finite subsets. 
However, you might try investigating whether there is or is not any way to make the construction associating to each set the set of its finite subsets into a contravariant functor, and if you can answer the question either way, hand it in as a homework problem. ---------------------------------------------------------------------- Regarding the definition of equivalence of categories on p.206, you ask: > ... If F is covariant must G be too? And if F is contravariant > G also? Right. More precisely, by the last sentence of Def.6.6.1, p.180, "functor" means "covariant functor" if the contrary is not stated. So the definition of equivalence should be interpreted with both F and G being covariant functors. Then a "contravariant equivalence" between categories C and D is an equivalence (in that sense) between C^op and D. > Is there an example in which FG=Id_D but GF is only isomorphic > but not equal to Id_C? Yes. Let C be a category, and D any skeleton on C. Let F be the functor determined as follows: For each object X of C let F(X) be the unique object of D isomorphic to X in C, and choose an isomorphism f_X: X -> F(X), using the identity isomorphism whenever X \in Ob(D). For h: X -> Y, define F(h) = f_Y h (f_X)^{-1}. (Draw the diagram to see how this works.) Let G be the inclusion functor of D into C. Then you'll see that FG=Id_D but GF is only isomorphic to Id_C. ---------------------------------------------------------------------- > Page 206. Is the last condition in lemma 6.9.5 what some people call > essentially surjective? Yes. I hadn't encountered the term, but doing a Google Book Search, I see that it is used that way. ---------------------------------------------------------------------- Concerning the statement on p.207, after Def.6.9.6, that the Axiom of Choice allows us to construct a skeleton for every category, you ask how we can do this for categories that are not small.
It sounds as though you are still thinking in terms of the paragraphs of motivation at the beginning of section 6.4, which suggested that "small" and "large" sets might be used as new names for what had been called sets and classes. But what we moved on to in that section was a set theory in which "small sets" were sets in a given universe, and "large sets" were any sets within our set theory; and in which ZFC, and so in particular, the Axiom of Choice, applied to the set theory as a whole, not just to small sets. (We still have "proper classes", subclasses of the class of all sets, as noted in the middle of p.172, and we can't apply the Axiom of Choice or our other axioms to these. But a "large category" by definition has a _set_ of objects; it belongs to our set theory; it is merely not required to belong to the distinguished universe U within that set theory.) ---------------------------------------------------------------------- You ask about my assertion on p.216 that the functor U^\omega is represented by the free commutative ring on a \omega-tuple of generators. For any ring R, U^\omega(R) is the set of \omega-tuples of elements of R (see p. 195, second sentence of Definition 6.8.5), and the ring with a universal \omega-tuple of elements is the free ring on an \omega-tuple of generators. (See discussion on p. 212 of the free group on 3 generators as the initial object in the category of groups with specified 3-tuples of elements. As discussed in the second paragraph on p. 213 and the theorem that follows, this can be translated as saying that that group is a representing object for the functor associating to each group G the set of 3-tuples of elements of G.) ---------------------------------------------------------------------- Regarding the statement you read somewhere that Cayley's Theorem is a case of the Yoneda Lemma (p.217), you write > ... 
Since the one object R of G_cat is just an abstract construction > I don't really understand what h_R means in this context. You can call that object "an abstract construction", but the definition of h_R still applies to it -- go to that definition, take R to be that one "abstract" object of G_cat, and see what object of what category h_R takes R to. Then remember that the concept of functor involves both objects and morphisms. So now check what Yoneda's Lemma says about morphisms in this case. ---------------------------------------------------------------------- You ask about the reason I reserve the word "free" for the construction of the left adjoint of an underlying-set functor (e.g., p.224 bottom), while other books you have read use it for more general left universal constructions. I can't be sure about the books you are referring to without knowing which they are and looking them over, but I suspect that they do not develop or assume known to their readers the general concepts of universal constructions, representable functors, and adjoint functors. I suspect that if they did, they too would use those terms in many places where they now use "free". Of course, since the word "free" is short and suggestive, there still might be a temptation to use it in place of the more technical-sounding terms. But despite this, I think that when one has a more precise language available, one will use it. It's no fault of those authors -- they are writing to an audience not familiar with the concepts of this course. ---------------------------------------------------------------------- Concerning the examples of adjoints given on p.225, you write > Many of our universal constructions create some sort of "free object" > (free group on a set, free ring on a monoid, "free ring" (tensor ring) > on an abelian group, etc.). ... Well, people use the word "free" with various degrees of generality. 
The formal definition we have given corresponds to a left adjoint of a set-valued functor, and does not cover things like the monoid ring on a monoid, or the tensor ring on an abelian group. On the other hand, using the term still more widely, one often calls a group etc. presented by generating set X and relation-set R "the group freely generated by elements x of X subject to the relations R", though this does not correspond to an adjoint functor, since only one object and not a family are being constructed. Likewise, as noted in the reading, the tensor product construction, though it gives an abelian group "freely generated" by the image of a bilinear map on given groups, is not a left adjoint. So I would say that what you are referring to as "free" objects could be described by the term "left universal constructions", as in the text; and that a large class of these, but not all, are covered by the concept of left adjoint constructions. ---------------------------------------------------------------------- You ask how I translate the question at the bottom of p.225, of whether the product functor C x C --> C had a left adjoint, into the universal-object question on the next page. I am essentially using the characterization of adjoint pairs of functors in Theorem 7.3.7(ii), in which one starts with the right adjoint U, and characterizes the left adjoint F object-by-object in terms of it. So if the product functor is to be a right adjoint, it will have the role of U, and the "C" and "D" of the theorem will be our C and C x C respectively; and the desired condition is that for every object of C (which I call X on p.226) there should exist a pair (R_X, u_X) representing the functor C(X, U(-)). Writing R_X as (Y,Z), this means a pair of objects Y and Z with a universal morphism of X into Y x Z. 
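As a sketch of where this translation leads, assume that C has pairwise products, so that U(Y,Z) = Y x Z is defined for all Y and Z. The universal property of the product then identifies the functor C(X, U(-)) with a hom-functor of C x C. (The symbol \Delta_X for the diagonal map is notation introduced here for illustration, not taken from the text.)

```latex
% Assumption: C has pairwise products, and U : C \times C \to C is (Y,Z) \mapsto Y \times Z.
\begin{align*}
C\bigl(X,\, U(Y,Z)\bigr) \;=\; C(X,\, Y \times Z)
  \;\cong\; C(X,Y) \times C(X,Z)
  \;=\; (C \times C)\bigl((X,X),\,(Y,Z)\bigr).
\end{align*}
% Hence one may take R_X = (X,X), with universal morphism the diagonal
% u_X = \Delta_X = (\mathrm{id}_X, \mathrm{id}_X) : X \to X \times X.
```

Under that assumption, the pair (Y,Z) sought above can be taken to be (X,X), with the diagonal map as the universal morphism of X into Y x Z.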
---------------------------------------------------------------------- Regarding the statement on p.226 (in the short paragraph after the display) that pairwise products/coproducts in Ab give a "cyclic" diagram of adjoints, you ask what this means. By "cyclic" I mean "repeating itself cyclically". ---------------------------------------------------------------------- Regarding the p-adic numbers (p.230) you write > I am somewhat familiar with the Hasse local-global principle, or at > least the result, that an equation solvable over all p-adics and over > the reals has a solution over the rational numbers. Does this result > arise in any way from the sort of constructions we have encountered? > I'm curious if the p-adic rings, themselves inverse limits, have any > similar relations which are useful in proving such results. Well, there's an obvious approach to looking for necessary and sufficient conditions for something to hold: One puts together all the necessary conditions one can come up with, and hopes that when one has listed enough of them, their conjunction will be sufficient as well, and that one can express that conjunction in some concise form. Necessary conditions for an equation to have a solution in the integers are that it not contradict anything one can deduce either using congruences, or using inequalities. Consistency with what one can deduce using congruences comes down to solvability in each Z/nZ. But Z/nZ is isomorphic to the direct product of the Z/p^i Z for p^i ranging over the maximal prime-powers dividing n; so having solutions modulo all integers is equivalent to having solutions in all rings Z/p^i Z. For fixed p, the conditions of having solutions in Z/p^i Z for various i are not independent, due to the homomorphisms (7.4.3); but the conjunction of these conditions over all i is equivalent to the existence of a solution in the inverse limit of (7.4.3), i.e., the p-adics.
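The reduction from Z/nZ to the rings Z/p^i Z can be checked by brute force on small cases. The following is a minimal sketch, not part of the notes: the function names and the sample equation x^2 + y^2 = 3 are chosen here purely for illustration.

```python
from itertools import product

def solvable_mod(f, n, nvars):
    """Brute force: does f(x_1, ..., x_k) == 0 (mod n) have a solution?"""
    return any(f(*xs) % n == 0 for xs in product(range(n), repeat=nvars))

def maximal_prime_powers(n):
    """Return the maximal prime powers p^i dividing n (for n >= 2)."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            out.append(q)
        p += 1
    if n > 1:
        out.append(n)  # whatever is left is a single prime
    return out

def f(x, y):
    # Illustrative equation (an assumption of this sketch): x^2 + y^2 = 3.
    return x * x + y * y - 3

# Solvable mod n exactly when solvable mod each maximal prime power dividing n.
for n in range(2, 40):
    assert solvable_mod(f, n, 2) == all(
        solvable_mod(f, q, 2) for q in maximal_prime_powers(n)
    )
```

For this sample equation the obstruction already appears modulo 4, since a sum of two squares is never congruent to 3 modulo 4.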
So the conditions one can get using congruences can be concisely summarized by saying "there exist solutions in the p-adics for each p". On the other hand, the study of the equation via inequalities comes down to asking whether it has a solution in the reals. The principle you describe evidently says, "Yes, congruences and inequalities are together enough to tell whether an equation has a solution." Our construction of the p-adics can be thought of as a way of condensing the information about integers arising from the study of congruences. As we noted, the p-adics have no zero-divisors, though each Z/p^i Z does; so this way of condensing information brings a kind of elegance. (You state the principle for solutions over the rationals rather than the integers. I guess for that case one would look at "congruences of rational numbers modulo fractional ideals", and use the p-adic field in place of the p-adic integers. By clearing denominators, I think questions of solutions in the rationals can be reduced to questions of solutions in the integers, and the use of the p-adic field can likewise be reduced to the use of the p-adic ring.) ---------------------------------------------------------------------- Regarding germs of functions, mentioned in the first paragraph of section 7.5, p.233, as a motivating example for direct limits, you ask why "germs" are so called. My guess is that the term arose in complex analysis. There, if one knows an analytic function in a neighborhood of a point, no matter how small, that determines uniquely its extension to as large a connected domain as it can be defined on. The original meaning of "germ" is "sprout" (what one gets when a seed germinates); so I think the idea was that a "germ" of an analytic function was a tiny thing that had enough information to determine the whole thing.
When one considers general continuous functions instead of analytic functions, the analogous entities no longer have the property of determining the value away from the point in question, but the word was probably carried over because the concept was useful. (And because the everyday sense of "germ", which refers to a microscopic entity, made it seem natural.) ---------------------------------------------------------------------- Perhaps in connection with Mac Lane's use of "complete", mentioned on p.243 (sentence before Exercise 7.6:2), you ask > For any category, can we construct a "completion" category > with all limits? There are such constructions, but I'm not familiar with the details. One obvious approach is to start with the Yoneda embedding of C in Set^{C^{op}}, note that, like any category of the form Set^D, the latter has limits, and close the image of C in that category under such limits. Another is to take a category whose objects are formal limits (one for each diagram whose limit one wants to allow), and let it have just those morphisms that the universal properties of limits require. Whether these constructions would give the same result, I don't know. A problem is that a category may already have some limits, but our construction might create new limits that don't agree with these. For example, the category Set clearly has inverse limits; but suppose we embed Set in the category HausTop of Hausdorff topological spaces by giving each set the discrete topology. Now consider the system of sets which, for convenience, I will write as ... -> Z/8Z -> Z/4Z -> Z/2Z -> 0. (We're not interested in the group or ring structure; just in the fact that as one goes back a step, each point bifurcates.) In HausTop, its inverse limit is the Cantor set, a compact space. In Set, its inverse limit is the same space with the discrete topology.
If we construct a universal completion of Set, then since HausTop is complete, the inclusion of Set in HausTop will induce a functor from our universal completion to HausTop which maps the inverse limit of the above system to the compact Cantor set, while since the set-theoretic Cantor set was already in Set, this will be mapped to the discrete Cantor set. Hence the limit of the above system in the universal completion will not be the same as its limit in the original category. Of course, one might choose to construct a completion designed to preserve those inverse limits that already existed, and which would be "universal" only for functors that preserve such limits. Looking online, I see that Grothendieck defined a completion "pro-C" for every category C. (For the "pro-", see p.240, paragraph before Ex. 7.5:18.) At some point I'll have to learn more about these things, for a paper I've been putting off writing for years. But not during a semester when I'm teaching. (Incidentally, where you wrote "limits", I'm not sure whether you meant inverse and/or direct limits in the sense of section 7.5, or the more general limits and/or colimits in section 7.6; but similar considerations should apply to both, though the details of the category one got would differ.) ---------------------------------------------------------------------- Regarding Proposition 7.6.3 on p.244, you note > ... You write C^D(\Delta(-), F) : C^{op} --> Set, but why need the > target be sets? Is it assumed that C^D is legitimate? The C^D(\Delta(X), F) will be sets, all right, but I agree that they may not be small sets, unless we assume D small. Thanks; I'll put a note about this on the errata page. ---------------------------------------------------------------------- Regarding Exercise 7.6:5(ii), p.246, you ask how an initial object, defined in terms of morphisms to other objects, can be characterized as a limit, whose universal property asserts the existence of morphisms from other objects.
If you look at the definitions of limit and colimit, you will see that each of these involves morphisms both into and out of the object in question. (Those morphisms have different roles in the definition; so the exercise requires relating morphisms that have one role in one definition to morphisms having a different role in the other.) ---------------------------------------------------------------------- Meant to answer your question in class, but I fell behind schedule in covering the earlier material. You asked, in the context of Theorem 7.8.8, p.252, whether one of the composite colimits in (7.8.9) could exist, and the other not exist. Yup. The example I've come up with is a direct limit, indexed by the natural numbers, of coequalizers in the category of finite sets. Let the n-th coequalizer diagram have for domain the integer n = \{0,...,n-1\} and for codomain n+1 = \{0,...,n\}, and let the two maps from the domain to the codomain be i |-> i and i |-> i+1. You should check that for each such diagram, the coequalizer is a 1-element set. On the other hand, let the n-th such diagram be mapped into the n+1-st by inclusion of domains and inclusion of codomains. Then if we take the direct limit over n of either domains or codomains, the result "wants to be" the set \omega, but this doesn't lie in FinSet; and it is not hard to show that no object of FinSet has the universal property of the desired direct limit; i.e., it does not exist; so of course there's no way to construct the coequalizer of these (nonexistent) direct limits. On the other hand, if one first takes coequalizers, getting 1-element sets, one sees that a direct limit of these exists, namely the 1-element set; so in that order, the colimit of colimits exists. ---------------------------------------------------------------------- You ask why I didn't simplify Definition 7.8.1 by putting Definition 7.8.12 (p.255) before it. 
The main reason was that I felt the details of Definition 7.8.1 emphasized the concept; and I feel that it is often good to have to deal with a concept "by hand" before introducing the machinery that handles it slickly -- one appreciates the machinery, and has a better sense of what it does. There's also a technical reason. The morphism of Definition 7.8.12 only makes sense if we assume both (co)limits exist. But if we merely assume Lim S exists, then the cone of Definition 7.8.1 exists. In that way, 7.8.1 is simpler than 7.8.12: it merely assumes Lim S exists, and the condition it gives is that F(Lim S) be the desired limit of FS. ---------------------------------------------------------------------- You ask about the statement preceding the first display on p.258, that by the construction of direct limits in Set, there exist D(i)\in P and an element x_i satisfying that display. Well, we have just shown that the left-hand side of that display lies in the direct limit over D of the sets B(D,E_i). The construction of direct limits in Set describes that direct limit as an image of the disjoint union of the sets B(D,E_i). What I don't say in Lemma 7.5.3, but is implicit, is that the maps of the given sets into that image constitute the coprojections; in this case, the q(D,E_i). So the desired element must be the image of some element of some set B(D,E_i) under its coprojection. For each i, we can name the "D" indexing that set as D(i), and we can name the element in question in that set x_i. That gives the display you asked about. ---------------------------------------------------------------------- You ask what I mean in Exercise 7.9:3(i), p.259, by "failure of one-one-ness" and "failure of surjectivity". The condition that some limit and some colimit commute means that the comparison morphism is an isomorphism; i.e., in the case of the category Set, a bijection. So the two ways it can fail are for that morphism not to be one-to-one, and not to be onto.
The exercise asks for examples of both cases. ---------------------------------------------------------------------- Regarding the lines preceding Corollary 7.9.6, p.260, you ask what is meant by a good finite subset. The word "good" is in quotation marks, so I am not using it in any standard meaning. Rather, by a "good" finite set I mean a set that satisfies some properties that will be helpful in getting the desired conclusion. What those properties are is seen in the statement of the Corollary. ---------------------------------------------------------------------- You ask (if I understand correctly) whether one can get the uniqueness of adjoints using the construction of Theorem 7.10.4, p.265. I don't think so. One of the marvelous facts about objects with universal properties is that one can obtain them in different ways, yet the different objects so obtained must be naturally isomorphic. But the ways of constructing them don't give those isomorphisms; that comes from the universal property. But your question adds the words "or other characteristics of possible adjoints"; and to this, one can sometimes give a positive answer. For instance, though no one in this class has yet seen what natural property the groups of (2.3.3) and Exercises 2.3:1-2 are universal for, once one does, the method of construction as a subgroup of a direct product does give one a bound on the order of the groups with that property. And studying which elements of the direct product lie in the subgroup, one can further improve that bound. ---------------------------------------------------------------------- You ask about the discussion in the last two paragraphs of p.273, where I state that the concept of "Cat-based category" is invariant under reversing order of composition. 
First, let me be explicit about what this means: It means that if C is a Cat-based category, then C^{op}, defined by letting C^{op}(P,Q) = C(Q,P), is also a Cat-based category if one takes the same concept of morphisms-among-morphisms that one had before. (It also becomes a Cat-based category if one reverses the directions of morphisms-among-morphisms; but let's just look at one modification at a time.) The point is that if one has a concept of composition, given by morphisms C(Q,R) x C(P,Q) -> C(P,R), then one can regard these as morphisms C^{op}(R,Q) x C^{op}(Q,P) -> C^{op}(R,P), and the left-hand side can be rewritten C^{op}(Q,P) x C^{op}(R,Q), giving precisely the kind of map one needs to define a category structure C^{op}. As for the composition of morphisms among morphisms, this is based on the "op" functor, which goes from Cat --> Cat, not from Cat to Cat^{op}. Anyway, I suggest that to get the idea without the complications of Cat-based categories, you look at the very last (3-line) paragraph on the page, which talks about the "op" construction on ordinary (Set-based) categories, and note that the opposite of an ordinary category is again an ordinary category, not a Set^{op}-based category. ---------------------------------------------------------------------- Regarding the last paragraph of p.273, you ask "why do we need the product of Set to define "op" functor?" We use the product construction in Set in defining the concept of category, since "composition" is defined by maps C(Y,Z) x C(X,Y) -> C(X,Z) (where "x" here stands for "direct product of sets"). So when we define the opposite of a category, we have to see how to take such maps defined on product-sets of C 's hom-sets and turn them into maps on product-sets of C^{op} 's hom-sets. To fit the changed domains and codomains of our maps, this turns out to require reversing the order of "C(Y,Z)" and "C(X,Y)", and that uses the symmetry of the direct product of sets. 
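The role of the symmetry of the direct product can be seen in a small sketch. The names below (Arrow, compose, op, compose_op) are hypothetical helpers introduced only for this illustration; arrows are formal paths in a free category, labelled in the order in which they are applied.

```python
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True)
class Arrow:
    """A morphism with explicit domain and codomain (hypothetical helper)."""
    dom: Any
    cod: Any
    label: Tuple[str, ...]

def compose(g: Arrow, f: Arrow) -> Arrow:
    """Composition C(Y,Z) x C(X,Y) -> C(X,Z): first f, then g."""
    if f.cod != g.dom:
        raise ValueError("arrows are not composable")
    return Arrow(f.dom, g.cod, f.label + g.label)

def op(a: Arrow) -> Arrow:
    """Regard an element of C(Q,P) as an element of C^op(P,Q), or vice versa."""
    return Arrow(a.cod, a.dom, a.label)

def compose_op(g: Arrow, f: Arrow) -> Arrow:
    """Composition C^op(Y,Z) x C^op(X,Y) -> C^op(X,Z).

    Here g lies in C(Z,Y) and f in C(Y,X), so the two factors must be
    swapped before the composition map of C can be applied.
    """
    return op(compose(op(f), op(g)))

# Usage: f : X -> Y and g : Y -> Z, both regarded as arrows of C^op.
f = Arrow("X", "Y", ("f",))
g = Arrow("Y", "Z", ("g",))
h = compose_op(g, f)  # an arrow X -> Z of C^op; in C it is "g, then f"
```

The swap of arguments inside compose_op is the only place where the product is used in the opposite order, which is why the result is again an ordinary category rather than something based on Set^{op}.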
---------------------------------------------------------------------- You ask, regarding something on p.277 (the fact that Prop.8.1.6 refers only to limits? The paragraph before Lemma 8.1.7?) "What goes wrong in the construction of colimits of algebras?" To answer that, I need some idea of how you think they would be constructed. So please let me know that, and I'll reply! ---------------------------------------------------------------------- Regarding Lemma 8.2.2, p.282, you ask whether, when \gamma is singular, the sequence of subsets S^{(\alpha)} always continues to grow past S^{(\gamma)}. Certainly not! For the easiest case, one can start with X = A; and then all S^{(\alpha)} are equal to A, i.e., the growth stops at the first step. By other constructions, one can obtain examples where the growth stops at any specified ordinal less than or equal to the value given by Lemma 8.2.2. ---------------------------------------------------------------------- Regarding the concepts of identity and variety (pp.289-290), you ask > ... Could we interpret \Omega-algebras satisfying sets of identities > as structures satisfying certain first-order theories in languages > possessing only function symbols in their signatures? ... It's not only the signature that has to be restricted, but also the syntax: No existential quantifiers, no negation, no implication, no disjunction. Just universally quantified equations. (One could allow conjunction, since having a conjunction in one's theory is just equivalent to having each of the conjoined identities; but esthetically it seems nicest to leave conjunction out here.) Certain slightly more general languages give classes of algebras that can also be treated nicely. 
For instance, if one allows, along with sentences of the preceding sort, universally quantified sentences of the form (conjunction of equations) => (equation), then one finds that one still gets free algebras and algebras presented by generators and relations, and that a class of algebras defined by such sentences is closed under almost all (but not quite all) of the operations discussed in today's reading. Such a class is called a "quasivariety of algebras", and these are also studied in universal algebra. (See p. 3, next-to-last line.) An example of a quasivariety is the class of torsion-free groups; another is the class of rings without nonzero nilpotent elements (elements satisfying x^n = 0 for some n). ---------------------------------------------------------------------- Regarding section 8.4 (pp.289-297) you ask > What sort of questions in specific theories like group theory or ring > theory do the results of this section help us to answer or rephrase > in a more manageable manner? None occur to me. I would say the question is like asking, "What sorts of problems about the complex numbers does the concept of a field help us answer?" The value of that concept is to generalize from specific structures, like the complex numbers, general properties that can be studied in a much larger class of cases. Results can then be -- and are -- proved about fields in general, and applied to the complex numbers in particular; but those results could, in principle, have been proved for the complex numbers alone, if we were sure that that field was all we would ever care about. Our goal from the beginning of the course was to see what was the common context in which results for varied classes of algebras such as groups, rings, etc., regarding free objects, construction by generators and relations, consideration of subclasses determined by identities, etc. could be proved. And we have just done that.
The results of this reading may look dull, because they are things that we already knew for the specific classes of algebras that we are familiar with. In subsequent sections, we shall get past these "dull" basics, and the material developed will hopefully look more interesting. ---------------------------------------------------------------------- Regarding section 8.4 (pp.289-297) you note, > In chapter 1, one exercise gives an alternative formulation of the > concept of group: we can get away with just the one 2-ary operation > > delta(x, y) = xy^-1. > > This gives us two varieties that are basically the same: ... Actually, the description of groups in terms of the operation delta does not quite give a variety; if one just uses identities, then the variety one gets consists of structures corresponding to groups, and also an empty structure. The category of groups is equivalent to the subcategory of nonempty algebras in this variety. However, that quibble (which one can get around by throwing into the above variety a zeroary operation e and an identity delta(x,e)=x, or delta(x,x)=e) doesn't invalidate the point you make: > ... We could say the varieties are isomorphic as categories, but > that doesn't say it all, so is there a term for this "stronger than > isomorphism but not quite equality" of varieties? I'm not sure whether there is a standard term. In the language of section 8.9-8.10, which we will read in about a week, the relation is that the two varieties of algebras have isomorphic "clones of operations". We will see that two varieties are related in this way if and only if there is an equivalence between them that respects underlying-set functors. 
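To make the passage between the two descriptions of groups concrete, here is a minimal sketch that recovers the usual operations from delta(x, y) = xy^-1 and checks them on the symmetric group S_3. The helper names are chosen here for illustration and do not come from the notes.

```python
from itertools import permutations, product

# S_3 as permutations of (0, 1, 2): a small nonabelian test case.
G = list(permutations(range(3)))

def mul(a, b):
    """Product of permutations: apply b first, then a."""
    return tuple(a[b[i]] for i in range(3))

def inv(a):
    """Inverse permutation."""
    r = [0] * 3
    for i, ai in enumerate(a):
        r[ai] = i
    return tuple(r)

def delta(x, y):
    """The single binary operation delta(x, y) = x y^-1."""
    return mul(x, inv(y))

# Derived operations, written using delta alone.
def e_from(x):
    return delta(x, x)                      # x x^-1 = e

def inv_from(x):
    return delta(delta(x, x), x)            # e x^-1 = x^-1

def mul_from(x, y):
    return delta(x, delta(delta(y, y), y))  # x (y^-1)^-1 = x y

identity = (0, 1, 2)
assert all(e_from(x) == identity for x in G)
assert all(inv_from(x) == inv(x) for x in G)
assert all(mul_from(x, y) == mul(x, y) for x, y in product(G, repeat=2))
```

Note that e_from needs some element x to start from, which reflects the quibble above: with delta alone, nothing forces the underlying set to be nonempty.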
Birkhoff named two algebras (as distinct from {\em varieties} of algebras) of possibly different types "crypto-isomorphic" if (in the above language) their clones of operations were isomorphic, which is equivalent to saying that the varieties they generate are equivalent in this way, via an equivalence that takes one algebra to the other. ---------------------------------------------------------------------- Regarding section 8.4 (pp.289-297) you ask, > ... Is there a general way in \Omega-Alg to express the things > where a property exists for some element. For example the groups for > which there is an element of order 5, but not necessarily satisfying > the identity x^5=e for all x? Well, elements g\in G satisfying g^5 = e correspond to morphisms from <x | x^5 = e> to G. Such an element will have order 5 if the morphism does not factor through the natural map to the trivial group. So one can describe groups having elements of order 5 by the condition that they have such a non-factoring morphism from <x | x^5 = e>. But it isn't a very natural class of groups -- it isn't closed under homomorphic images or subalgebras; and in general, one can't do universal constructions in it. Perhaps the fact that one can describe it is what you are looking for; but giving such classes a name isn't likely to be helpful. There are two directions one can go from there to get more natural classes. One is to look at the category consisting of groups with a distinguished element g satisfying g^5 = e (dropping the requirement that this element not equal e). This is essentially a variety of algebras: We take the description of the variety Group, and adjoin one additional zeroary operation, specifying the distinguished element, and one additional identity, saying that the new element should have exponent 5. Homomorphisms between such structures should, of course, take distinguished element to distinguished element. The other is to consider groups _not_ having any elements of order 5. 
That is not, I think, equivalent to a variety; it is defined by operations, and identities, and the additional condition (\forall x) (x^5 = e) => (x = e). Classes of algebras defined by such conditions -- identities together with universal equational implications -- are called "quasivarieties", and they have properties nearly as nice as varieties. This quasivariety is really the opposite of what you asked for (groups with an element of order 5); but perhaps you'll find it as interesting as what you asked for. I guess there's a third answer to your question. Model theorists consider any sort of family of first-order sentences, and look at the class of "models" of that family, and call such classes axiomatic model classes. So if we let the family be the identities for groups, together with the sentence (\exists x) ((x^5 = e) and not-(x = e)), then the resulting axiomatic model class is what you are asking about. But in considering such classes, model theorists are pretty far from universal algebraists. ---------------------------------------------------------------------- You ask about the terms used in display (8.5.4) on p.299, in particular, the terms "model" and "first-order theory". Model Theory studies structures consisting of a set given with a family of relations of various arities on it, some of which may be operations, and statements about such structures which can be expressed by formal sentences constructed using element-symbols, relation symbols corresponding to the given relations, and logical operations such as "implies", "for all", "there exists", etc.. A "first-order sentence" allows these operations, but with "for all" and "there exists" applicable only to elements, not to relations. (So a statement like "X is infinite", though it can be expressed by saying "there exists a set of pairs of elements which satisfies the conditions to be a function X -> X that is one-to-one but not onto", is not equivalent to a first-order sentence.) 
Given a language involving certain relation-symbols, a "model" for that language is a set given with relations corresponding to the relation-symbols of the language; and the "theory" of a model or family of models is the set of those statements in the language which are true for all these models. A model of a theory is any model which satisfies all the sentences in the theory. Our concept of a variety is a restricted case of these concepts: The only relations we consider are operations, and the only sentences we consider are universally quantified equations (so, nothing with "not", "or", "there exists", etc.) ---------------------------------------------------------------------- Regarding the discussion of Vopenka's principle on p.299, you ask what the "special properties" of the cardinal mentioned in the discussion there are. I'm afraid you'd have to ask a logician that! Sorry. ---------------------------------------------------------------------- In connection with Prop.8.6.3(iv), p.303, you ask why the endomorphism extending the set map v is not assumed to be unique. The condition "F is generated by the image of X" forces uniqueness. (In this context, it is strictly stronger than uniqueness: E.g., in the situation of Exercise 8.6:4(i), uniqueness holds, but the monoids in question are not free objects in a _subvariety_ of Group.) So since we have to state the stronger condition of being generated by v(X), there's no point in bringing in the implied condition of uniqueness of extending homomorphisms. ---------------------------------------------------------------------- In connection with Exercise 8.6:8, p.305, and the following discussion, you ask > Is it easy to find examples of rings that satisfy S_{2d+1} = 0 but not S_{2d} = 0 ... ? ... Hmm -- fiddling around, I think that S_{2d+1} = 0 is equivalent to S_{2d} = 0. 
Namely, if in S_{2d+1} we substitute 1 for, say, the last indeterminate, and evaluate S_{2d+1}(x_1,x_2,...,x_{2d},1), we find, I think, that those terms where the "1" appears in the last position give S_{2d}(x_1,x_2,...,x_{2d}), those where it appears in the next to last position give the negative of this, those where it appears in the third from last position give S_{2d}(x_1,x_2,...,x_{2d}) again, and so on; and since there are an odd number of positions, we get exactly S_{2d}(x_1,x_2,...,x_{2d}). So the identity S_{2d+1} = 0 implies S_{2d} = 0. > ... Is there a theory of associative algebras with an extra > S_n-like operation, akin to Lie algebras? Not that I've heard of. > ... Is "S_4 = 0" the simplest nontrivial identity satisfied by the > 2 by 2 matrices? I believe it is the polynomial identity of lowest degree; but an identity that one may find conceptually simpler is (XY-YX)^2 Z = Z (XY-YX)^2; i.e., the square of every commutator is in the center. This can be proved for matrices over the complex numbers by noting that XY-YX has trace 0, and verifying that every matrix with trace 0 has square a scalar matrix. (That fact in turn can be "seen" from the fact that among matrices with a given trace, in particular among those with trace 0, the diagonalizable ones form a dense subset, and it is clear that a trace 0 diagonalizable 2x2 matrix has scalar square.) Knowing the result for matrices over the complex numbers, and the fact that the field of complex numbers generates the variety of commutative rings, one can deduce that the identity holds for matrices over any commutative ring. ---------------------------------------------------------------------- You ask about the type \Omega of the variety of Lie algebras (p.307). The list of operations begins with the structure of k-module, which has one 0-ary operation (the element 0), one binary operation (addition), and a unary scalar-multiplication operation for each element of k. 
(If one builds up the concept of a k-module starting from that of an abelian group, then one also has the unary operation of additive inverse, and I list that in (8.7.2). However, that is equivalent to the operation of multiplying by -1\in k, so it can be omitted, as I have done above.) Finally, in addition to all of these, there is the binary operation of Lie bracket. ---------------------------------------------------------------------- Regarding the concept of a Lie algebra over a commutative ring k (p.307), you ask > ... What falls apart if k is allowed to be noncommutative? There is no appropriate way to define a bilinear map on a module over a noncommutative ring. In other contexts, the "correct" noncommutative generalization of a bilinear map of modules is to consider a right R-module M and a left R-module N, and consider a map f: M x N --> A, where A is an abelian group, and f(xr, y) = f(x, ry) for all x\in M, y\in N, r\in R. (More generally, M can be an (S,R)-bimodule, N an (R,T)-bimodule, and A an (S,T)-bimodule, for rings S and T which may or may not equal R, and one can add the conditions f(sx, y) = sf(x, y), f(x, yt) = f(x, y)t for s\in S, t\in T.) However, condition (8.7.5), i.e., [x,y] = -[y,x], means that we can't distinguish "right factors" from "left factors" in the operation of a Lie algebra; and there is no decent version of bilinearity of a map of modules over a noncommutative ring that makes sense without such a distinction. But there is a concept that does combine a Lie algebra with a noncommutative algebra; I will mention it briefly in class: A "Poisson algebra" (over a commutative ring k) is a k-module together with two multiplications, one associative and one Lie, such that Lie bracket with every element is a derivation on the associative algebra structure. ---------------------------------------------------------------------- In connection with the discussion on pp.310-311 of how every Lie group gives a Lie algebra, you ask whether the reverse is true. 
Every finite-dimensional Lie algebra over the real numbers arises from a Lie group (necessarily of the same dimension). That group is not unique, but for any such Lie group, its universal covering space is again a Lie group, and is the unique simply connected Lie group, up to isomorphism, which gives the indicated Lie algebra. For the infinite dimensional case, one would need to specify a concept of infinite dimensional manifold to use in defining infinite dimensional Lie groups. I know that people do work with such concepts, but there are varied choices of definitions, and I don't know for what choices, if any, such a result has been proved. Infinite dimensional Lie algebras are perfectly natural algebraically, however, as one can see from the part of this section up to where the relation with Lie groups is introduced. ---------------------------------------------------------------------- Regarding the concepts of a clone of operations (p.314) and a clonal category (p.317), you ask whether these concepts have applications to other fields of mathematics. The idea of a "clone of operations" is a generalization of specific concepts like "the set of all derived group-theoretic operations", "the set of all derived lattice-theoretic operations", "the set of all continuous operations on a topological space", etc.. These specific concepts are looked at in group theory, lattice theory, topology, etc.. General Algebra (or Universal Algebra) is the field of mathematics where one abstracts from these specific situations and looks at what one can say about such things in general; so from that point of view, the concept of clone "belongs to" General Algebra. People in other fields may or may not find it valuable to put the results they obtain about the objects that they look at in such a more general context; insofar as they do, they will find the language of general algebra useful. 
Some theoretical computer scientists have been happy to adopt concepts from Category Theory and General Algebra, including that of a clone of operations. Whether the use they make of these is in fact valuable, I don't know. ---------------------------------------------------------------------- You ask about the two definitions of hyperidentity referred to on p.322. These are very different concepts; it is unfortunate that they are given the same name. I find the definition in which the identities are assumed only for the primitive operations an unnatural one: the choice of which operations to regard as the "primitive" ones is just a matter of convenience when defining a mathematical concept. (E.g., one _could_ define "group" using only the operations (x,y) |-> x y^{-1} and e, and then that operation would become primitive while multiplication would no longer be.) So to require that "all primitive operations" satisfy some identities seems very arbitrary. > ... as far as I can tell, the distinction does not matter in > Exercise 8.9:14. It does. If we replace (a) by the statement that all primitive unary operations are equal, this would not imply the same for all derived unary operations. For instance in Group, there is only one primitive unary operation (the operation of inverse), so all primitive unary operations are equal; but the derived unary operation x . x is not equal to the one primitive unary operation. What you may have noticed is that the significance of (b) is unchanged if we replace "primitive" by "derived". That is true precisely because (b) is equivalent to condition (a), which is a hyperidentity in the sense I use. ---------------------------------------------------------------------- Regarding the last paragraph on p.323, you ask > Does this construction correspond in any way to an enriched structure > on the category C? 
I think that a functor \box of the indicated sort is the kind of thing one uses in defining a "category over C", i.e., a category D whose morphism-sets are objects of C, and whose composition operations are morphisms D(Y,Z) \box D(X,Y) --> D(X,Z). So it's not C that is an enriched category, but the resulting categories D. However, the assumptions one has to make on \box for this discussion may be weaker than those needed to define enriched categories. I don't know much about operads -- I've just seen the concept sketched, and in these paragraphs, I'm pointing the interested reader to something that he or she might want to look into. ---------------------------------------------------------------------- Concerning the second line after (9.1.5) on p.332, you write > ... you say that $x, y \in |SL(n, A)|$ are the images of associated > universal element $r\in |SL(n, R)|$ under homomorphisms > $f, g : R --> A$. You mean V(f)(r) = x and V(g)(r) = y? Strictly speaking, yes. However, here I am writing informally, and given a homomorphism f of rings, and a matrix M over the domain of f, one can speak of "applying f to M," meaning that one applies it entrywise. Hmm, maybe if I change "under" to "via" it would at least make clear that the reader has to think about how x and y are obtained from f and g. ---------------------------------------------------------------------- Regarding the concept of cogroup, sketched on p.332, you ask > Do such things as cogroups arise naturally in other contexts than > looking at the existence of an adjoint? This just depends on what one considers a "good motivation". I find the question "which functors have adjoints?" to be of great interest, so I motivate coalgebras in these terms. Alternatively, one can say "Many of the most basic constructions of algebra, when regarded as set-valued functors, turn out to be representable. Yet they also often have algebra structures (e.g., the construction SL(n)). How does such algebra structure arise?" 
As sketched in section 9.1, it arises from a coalgebra structure on the representing objects. ---------------------------------------------------------------------- Regarding the development of V-algebra objects of a general category, i.e., objects analogous to systems of algebras defined by operations and identities, on p.333 et seq. (and also the fact that we developed the theory of varieties, and not of more general classes of algebras, in earlier sections), you ask > ... Why do we stop at identities -- why not proceed to arbitrary > predicate calculus statements, then to arbitrary \Pi_1 statements ... Two reasons. The first was illustrated by the exercise in Chapter 2 showing that there do not exist free fields, and the exercise in Chapter 3 to show that a commutative ring R does not, in general, have a universal homomorphism to an integral domain. We are looking for constructions with universal properties, and classes of models of arbitrary sentences in the predicate calculus (e.g., the sentences defining fields, and integral domains) do not typically admit these constructions. The class of models of a family of universally quantified equations, i.e., identities, does. Secondly, not every sort of sentence in the predicate calculus has a reasonable category-theoretic translation. There are ways to remedy each of these problems. One can investigate what sorts of sentences, other than identities, do yield classes of models allowing universal constructions, and develop the appropriate topics -- yielding "quasivarieties", "prevarieties", and related concepts. And one can look for special classes of categories for which one can define the analogs of models of general sentences, leading into the theory of "topoi". 
The first set of concepts would allow us to generalize the material of this chapter; and if I ever find time to write further chapters, I intend to introduce it; but they wouldn't be able to fit into the scope of a 1-semester course (unless we left out a lot of other things we've done). On the other hand, as to using a topos as the category in which we define our algebra objects, this would be very restrictive. E.g., the opposite of the category of commutative rings is not a topos, so our expression of SL(n,-) in terms of a group object in that category could not be expressed in this context. If we wanted a context in which we could do these things, it would have to be one which would exclude most of what I center this course around. This could be a supplementary topic for a continuation of this course, but not one that subsumes what we have been doing. Incidentally, I use the concepts of quasivariety and prevariety in a recent preprint, http://math.berkeley.edu/~gbergman/papers/pv_cP.pdf , for which I recall the definitions (in section 2); you might find that paper interesting to look at. ---------------------------------------------------------------------- Regarding the concept of algebra object in a category (p.334), you ask how we will be using it, saying > ... Even though I read the whole section, I am not grasping the point > of having this concept...I feel like it is going back and forth. Perhaps the following will help. First think of algebra objects, not in terms of the question "How will we be using them?", but as an answer to the question, "How can we take the concept of an algebra, which is defined as an object of the category Set together with certain additional structure, and get a generalization with Set replaced by a more general category C?" The answer is quite straightforward: In the usual concept of algebra, we have a set |A| with some maps |A| x ... 
x |A| --> |A|, which we call operations; so for the modified concept, we will assume the category C has finite products, take an object of C which we will call |A|, and define "operations" to be morphisms |A| x ... x |A| --> |A| in C. As for the material going "back and forth"; what you have to see is why it does so. After we define the concept of an algebra object in a category C, we find that it leads to a family of algebra objects in the usual sense: Just as every object X of a category C allows us to create a large family of sets, namely the hom-sets C(Y,X) for the different objects Y, so we find that an "algebra A in C" leads to a family of "algebras in Set", i.e., algebras in the traditional sense, namely, the sets C(Y,|A|) with algebra structures induced by the algebra structure of A, via the universal property of direct products. But the algebra object A in C is not these algebras; they can be thought of as its "shadows" in the category Set. However, we can study it using these "shadows"; in particular, we prove that A will satisfy the diagrammatic conditions corresponding to any identities if and only if these "shadows" satisfy those identities themselves. ---------------------------------------------------------------------- Concerning Definition 9.3.5 on p.339, you write > ... I am almost sure this representability is not equivalent to > the representability of section 7. It's a generalization thereof. Part (ii) of definition 9.3.5 says in effect "the functor must have the property that, if you forget the operations, it gives a representable functor in the sense of chapter 7". If V = Set, which can indeed occur, since sets are just algebras with no operations, then there are no operations to forget, and part (ii) then says that in this case -- the case to which the definition of chapter 7 applies -- the definition agrees with that of the chapter. 
(This is like the question "does the definition of multiplication of complex numbers conflict with the definition of multiplication of real numbers?" No, because when restricted to real numbers, it gives the same operation.) ---------------------------------------------------------------------- Concerning Definition 9.3.5 on p.339, you write > I am not sure why Rep(C,V) is a full subcategory. To define a subcategory Y of a category X, one must specify its objects and its morphisms. If one begins by specifying the objects, and also says it is to be a "full subcategory", this means that for two objects that belong to Y, the morphisms between them in Y are to be all the morphisms they have between them in X. Since the objects of V^C are functors, when I say in the last sentence of Definition 9.3.5 that Rep(C,V) "consists of" the representable functors, I am specifying the objects of the subcategory. Saying it is a full subcategory says that the morphisms between two such functors in Rep(C,V) are defined to be all the morphisms they have between them in V^C. (Did you understand the definition of "full subcategory" when you asked this question?) ---------------------------------------------------------------------- Regarding the diagram on p.341, you ask whether it is usually easy to obtain a concrete form for G(A). There's no "usually"! It depends on the categories involved. (After all, these left adjoints are examples of universal constructions, and we saw in Chapter 3 that general universal constructions in groups, such as the description of groups presented by given generators and relations, range from very easy to very hard.) ---------------------------------------------------------------------- Concerning the beginning of section 9.6 (pp.346-347) you write: > ... Can we think of the multiplication in Monoid to be derived > from the comultiplication? In other words, can we think of > comultiplicaiton to come before multiplication? 
It will be easier to talk about the SL(n) example, because there the two varieties, CommRing^1 and Group, are different, so when I talk about "the ring operations" and "the group operations", you will know whether I mean the operations of the domain or the codomain of the representable functor. In the SL(n) construction, the varieties CommRing^1 and Group are defined first -- the definitions of those varieties use only operations, as in all the preceding chapters of the book; no co-operations. Given those two varieties, one considers a functor from the first to the second. This is a construction that takes for input any commutative ring A, and produces for output a group SL(n,A). The operations of the group that we construct are defined using the operations of the ring, and the way that this is done is encoded in the co-operation on the representing object. Our investigation of representable functors Monoid --> Monoid is similar, except that instead of being given a known construction like SL(n), and finding a way of encoding it using co-operations, we are investigating all possible functors Monoid --> Monoid that can be encoded in this way, i.e., that are representable. > ... Even with the example of SL(n), I am not grasping > what is meant by representing multiplication. ... Well, let's go back a few steps. Make sure that you understand the following points: --> If V: C --> Set is a representable functor, with representing object R, then for any object A of C, elements of V(A) correspond to homomorphisms R --> A. --> In the above situation, R has a universal element of V of it -- an element of V(R) that can be sent to each element of each set V(A) by a unique morphism R --> A. --> In the above situation, R \coprod R likewise has a universal ordered pair of elements of V of it. 
We then see that: --> If there is a binary operation which we can define in a functorial way on the objects V(A) (A\in Ob(C)), then by applying it to the universal ordered pair of elements of V(R\coprod R), we get what can be considered a universal instance of that operation. It will be an element of V(R\coprod R), hence will correspond to a morphism R --> R\coprod R. That is the co-operation. Using the universal property of R\coprod R, it determines the operation on all objects V(A). - - - The question of "which comes first" really depends on the situation one chooses to look at. In the SL(n) situation, we first knew how to multiply such matrices, and then translated this into a co-operation. In the present study of functors Monoid --> Monoid, we don't know, at the outset, what representable functors exist, so we are starting with the properties that a representing monoid and a comultiplication and co-neutral-element must have if it is to determine such a functor. In any case, note that the operations of the representing object R must be defined before we can speak of co-operations, since a co-operation is a map to a coproduct, and the structure of a coproduct of copies of R depends on the algebra structure of R. - - - Unless this clears the problem up completely, I suggest coming to office hours to discuss it. ---------------------------------------------------------------------- Regarding Theorem 9.6.20 on p.352 you ask, > How do we know that multiple E-systems don't correspond to the same > representable functor from monoids to monoids? That's implied by Exercise 9.6:2. The unit of the adjunction is the map taking each E-system X to PQ(X). If two nonisomorphic E-systems X and X' had Q(X) isomorphic to Q(X'), then PQ(X) would be isomorphic to PQ(X'); but by the exercise, they are isomorphic to X and X' respectively, hence not to each other. Intuitively, the result says that one can recover the structure of X from P(X), namely by applying the functor Q. 
(Theorem 9.6.20 is not very clearly stated; I have made a note to rewrite it.) ---------------------------------------------------------------------- Regarding the last line of Theorem 9.6.20, p.352, you ask > ... What is meant by the word "equivalence" ... See Definition 6.9.4, p.206. > ... and why isn't it italicized? In an italic passage, de-italicization is used to show emphasis, just as italicization is used in non-italic text. I'm not entirely happy with that convention, since to my eyes, de-italicization doesn't make a word stand out the way italicization does, and doesn't give the same "feeling" of emphasis. But the convention is standard, and I follow it. I am emphasizing the word "equivalence" in this theorem because it conveys the "punch" of the result: that E-systems give all the information one could ask for about representable functors from monoids to monoids: What the distinct structures are, and how they can be mapped to each other. ---------------------------------------------------------------------- Regarding the first two diagrams on p.353, you write > ... I don't see how it is decided that the E-system represented by > the first boxes represents the identity functor and the second boxes > represents the opposite monoid functor. Why can't the first one > represent the opposite monoid functor and the second one the identity > functor? At the beginning of the next-to-last line on p. 352, note the word "respectively". Make sure you understand what it means, and what consequences it has for the two coalgebras you ask about. Then look at the functors those two coalgebras represent. If you have trouble at some point in this path I've outlined, tell me where, and I'll help you from there. > I have a problem with seeing what difference first coalgebra having > the comultiplication m(x)=x^{rho} x^{lambda} for all x, and the > next one having the comultiplication m(x)=x^{lambda}x^{rho} > for all x, make to the respective functors. ... It's not "for all x"! 
It's for the one element x\in ||R|| that has degree 2! The other elements x^n have higher degree, and are mapped by the comultiplication to correspondingly more complicated expressions. > ... Do you mind pointing out where I should re-read to understand > this? I guess the best place to start is Definition 9.3.1, p.337. The second paragraph of that definition describes the operations on the functor represented by a coalgebra object. It begins by saying that these are induced "under the dual of the construction of the preceding section", but then gives an explicit description of that construction. This Definition is written from the point of view of going from the co-operation to the operation, but that should not be a problem: Take the case where |R| is the free monoid on one generator x, note how to identify the set-valued functor represented by |R| with the underlying-set functor on monoids, form the coproduct of two copies of |R|, call the generator of the first "x^\lambda" and the generator of the second "x^\rho", then consider two different co-operations |R| --> |R|^\lambda \coprod |R|^\rho, namely, m_1 taking x to x^\lambda x^\rho, and m_2 taking x to x^\rho x^\lambda. (And note, as I emphasized above, that these do not take every element r of |R| to r^\lambda r^\rho or r^\rho r^\lambda. You should see what they do to other elements.) Then apply Definition 9.3.1 to find the binary operations on the underlying-set functor Monoid -> Set induced by those two co-operations. Hopefully, you will find that one of them is the original operation of the monoids to which the functor is applied, while the other is the opposite multiplication. If you get stuck, come to office hours and go through it with me. (Or if the problem is one that can easily be described in e-mail, you can e-mail it to me.) 
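For a reader who wants to check the outcome of the computation just described, here is a minimal sketch in Python. Everything about the encoding is an assumption made for illustration, not notation from the notes: the element x^n of the free monoid |R| is encoded as the integer n, words in the coproduct are strings in the letters "l" (for x^\lambda) and "r" (for x^\rho), and the names comultiplication, induced_operation, op1 and op2 are hypothetical. The test monoid A is the monoid of strings under concatenation, chosen because it is noncommutative.

```python
def comultiplication(image_of_x):
    """Return the monoid homomorphism from |R| to the coproduct that is
    determined by the image of the generator x. Since it is a
    homomorphism, x^n goes to (image of x)^n."""
    def m(n):
        return image_of_x * n  # string repetition, i.e. the n-th power
    return m

m1 = comultiplication("lr")  # x |-> x^lambda x^rho
m2 = comultiplication("rl")  # x |-> x^rho x^lambda

def induced_operation(m, multiply, unit):
    """Binary operation induced on the underlying set of a monoid A.
    An element a of A is identified with the homomorphism x |-> a.
    A pair (a, b) gives the map from the coproduct sending x^lambda to a
    and x^rho to b, and we evaluate that map on the word m(x)."""
    def op(a, b):
        result = unit
        for letter in m(1):  # m(1) is the image of x itself
            result = multiply(result, a if letter == "l" else b)
        return result
    return op

def concat(u, v):
    """Multiplication of the test monoid: concatenation of strings."""
    return u + v

op1 = induced_operation(m1, concat, "")
op2 = induced_operation(m2, concat, "")

print(op1("ab", "c"))  # the original multiplication of A
print(op2("ab", "c"))  # the opposite multiplication of A
print(m1(2))           # image of x^2 under m_1
```

In this encoding m1(2) is "lrlr", that is, (x^\lambda x^\rho)^2, and not "llrr"; this is the point emphasized above, that the co-operations do not take every element r to r^\lambda r^\rho.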
Once you see these examples, you will, hopefully, understand how, given an operation on a representable set-valued functor, one can go the other way, and find the co-operation on the representing algebra that induces it. Let me know how this goes. If I can identify the roadblocks that keep students from understanding this material, I can hope to get it across better in the future! ---------------------------------------------------------------------- Regarding the results of section 9.6, summarized on p.353, you ask, > ... We just described representable functors from MONOID > into itself. Does this shed any light on representable functors > from RING^1 into itself? (I'm asking because rings are monoids > with extra structure.) The problem with that approach is that determining representable functors from Monoid to Monoid comes down to saying "These are the only ways one can obtain an associative operation on (appropriate sorts of) tuples of elements of a monoid, using the monoid operation alone"; but when we are looking at functors on Ring^1, we aren't limited to using "the monoid structure alone". Generally, if we are looking at functors out of a given variety W, then restrictions on the functors we can get into one variety V will also give us information about restrictions on functors from W to other varieties V' that in some sense "have a V-structure and more", but won't give restrictions on functors from varieties W' that have "W-structure and more"; inversely, existence results for representable functors V --> W will give existence results on such functors V' --> W when V' has "a V-structure and more", but will not give existence results on functors V --> W' where W' has "a W-structure and more". A description of all representable functors Ring^1 --> Ring^1 is, however, obtained in my book with my first PhD student, Adam Hausknecht, reference [2]. 
---------------------------------------------------------------------- You ask about a generalization of Exercise 9.7:1(v) (p.355). I hope you would want to include (iii) and (iv) along with (v), since they all show the same pattern. In very general form, the pattern is that when one has a retraction of a mathematical object X to an object Y, meaning a map f: X --> Y which has a right inverse g: Y --> X, i.e., such that fg is the identity morphism of Y, though gf may not be the identity of X (see p.189, paragraph containing (6.7.3)), then maps from any object Z to Y correspond to maps h: Z --> X such that h = gfh; and likewise maps Y --> Z correspond to maps i: X --> Z such that i = igf. Namely, the map h: Z --> X corresponds to fh: Z --> Y, and the map i: X --> Z corresponds to ig: Y --> Z. You can verify that this gives bijections of sets of maps in each case. The situations occurring in the exercise are a little more complicated in that the composites FU and UG are not quite the identity functors of the categories in question, but isomorphic to the identity functors. So one has to make a little adjustment (noted parenthetically at the end of 9.7:1(iii), and then taken for granted in the remaining parts). ---------------------------------------------------------------------- You ask whether our classification of representable functors K-Mod --> L-Mod on pp.357-361 requires that K and L have unity. The classification could be carried out either with or without that assumption. (In the latter case, of course, we would leave out (9.8.6) and (9.8.11).) 
But the category of modules over an object K of the category Ring is equivalent to the category of (unital) modules over the object K^1 of Ring^1, where K^1 is the ring whose underlying additive group is the direct sum of the additive groups of Z and of K, and whose multiplication is defined in a way that uses the multiplication of K on pairs of elements of K, and makes the 1 of Z the multiplicative neutral element. Since rings of the form K^1 are far from all rings (e.g., no field has that form), the categories we get by considering nonunital base rings are more restricted than those we get by studying unital modules over unital base rings. So it seems best to study the unital case, and obtain results for nonunital rings, whenever one needs them, as corollaries gotten by applying the unital results over the rings K^1, L^1. ---------------------------------------------------------------------- You ask whether the concept of tensor product was motivated by the considerations on pp.361-363. I think that the tensor product construction originated in physics, where it was realized that the concept of "tension" in a solid -- the expression for the forces acting to stretch and compress the material -- could be expressed as a member of a vector space, and that this space had more dimensions than the 3-dimensional space R^3 in which the material lived, but that it was closely connected to R^3, since every rotation of an object in R^3 induced a corresponding transformation on the expression for tension. It was finally worked out that it was a space generated by the image of a bilinear map R^3 x R^3 --> V, and various spaces with such multilinear maps were called "spaces of tensors"; and eventually a universal space S with a multilinear map V x W --> S was called "the tensor product of V and W". (Originally, for vector spaces V and W; later for more general modules and bimodules.) This is just the impression I've picked up; I've never studied the history of the subject. 
But I'm sure that the relation with composition of representable functors was realized much, much later. ---------------------------------------------------------------------- Regarding the constructions C^pt and C^aug (Def. 9.10.1, p.366), you write > For a given category, we need not have a way to turn any object into > a pointed object, as there need not be morphisms from the terminal > object to every object, but all objects can be made augmented, right? Nope. First, if we have an object X of C that doesn't admit a morphism from the terminal object, then the corresponding object of C^op won't admit a morphism to the initial object (since a morphism from it to the initial object in C^op is just a morphism from the terminal object of C to it). But we don't have to go to such cases to get examples. In the category CommRing^1, no ring that contains a field can be augmented. (I.e., it can't have a homomorphism to Z.) ---------------------------------------------------------------------- You ask about "the intuition for deriving representable functors $V^op --> W$ from $V \bigcirc W$" (p.378). The idea is that given an object $R$ of $V \bigcirc W$, it has both a $V$ structure and a $W$ structure. Because of the former, one can associate to every object $A$ of $V$ the set of homomorphisms $V(A,R)$, and because of the latter, one can apply any n-ary operation of $W$ pointwise to these V-homomorphisms. Finally, the "commutativity" relations have the consequence that the result of applying a W-operation pointwise to a tuple of V-homomorphisms is again a V-homomorphism, making our set of V-homomorphisms a W-object. As noted in class, duality of vector spaces is an example (with V=W). ----------------------------------------------------------------------