ANSWERS TO SOME QUESTIONS ASKED BY STUDENTS in Math 245, taught
from my notes "An Invitation to General Algebra and Universal
Constructions", http://math.berkeley.edu/~gbergman/245,
Spring 2008, Fall 2011, Spring 2014, Fall 2015, and Fall 2017.
These are my responses to questions submitted by students in
the course, the answers to which I did not work into my lecture,
but which I thought might be of interest to more students than
the ones who asked them.
(Responses to questions submitted before 2015 have been adjusted
to the numbering of the 2015 version of the text. Since the published
version and the online preprint version differ in pagination, I refer
to results by number but not by page. All chapter-numbers have
increased by 1 since the pre-2015 versions of the text, because
the software used in the published version does not allow a text
to start with "Chapter 0", as this text previously did.)
----------------------------------------------------------------------
You ask whether, in Exercise 2.2:2, when I speak of groups
yielding the same pair, G'_1 = G'_2, I mean isomorphic pairs.
No; by equality I mean equality! If you have proved a result
about isomorphism, you can submit that as homework -- investigations
of questions of one's own devising are accepted as homework, if
they are relevant to the course -- but the question asked was
about genuine equality.
----------------------------------------------------------------------
Regarding the construction of group-theoretic terms in section 2.5,
you ask whether we need to consider $X$ of uncountable cardinality.
That depends on what purpose we will be using our terms for. If we
want to use them to write down identities, then the countable case is
enough. (We'll show that explicitly, toward the end of the course,
for arbitrary sorts of algebras. It is the second paragraph of
Lemma 9.4.2, in the case where \gamma = \aleph_0.)
On the other hand, if we have some explicit uncountable group G
(for example, the group of all bijections from a countable set
S to itself), and we want to reason about all relations satisfied
by an uncountable set X of elements of G, then we need to use
terms in that uncountable set.
----------------------------------------------------------------------
You ask why, in the middle of the paragraph following Exercise 2.5:2,
I say that \iota_T(x.y) would be x.y^{-1}, rather than (x.y)^{-1}.
I say earlier in the paragraph that "We need to be careful".
This example illustrates how a rule that one might naively
propose for defining terms and term operations as strings of
symbols and operations on those strings could go wrong. The rule
in question defines \iota_T to simply append the symbol ^{-1}
to whatever string one plugged into it; and as this example shows,
it would not have the properties one wants. In the next paragraph
I talk about using parentheses as part of one's string of symbols,
which does give what one wants. (It uses more parentheses than
the minimal number needed, but at least it works.)
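To make the contrast concrete, here is a small sketch (the function
names are my own, invented for illustration), with terms represented
as plain strings:

```python
# Sketch: two candidate string rules for the inverse operation \iota_T.
# Names invented for illustration; terms are plain strings.

def iota_naive(s):
    # naively append the symbol ^{-1} to the string
    return s + "^{-1}"

def iota_paren(s):
    # wrap in parentheses first, so the scope of ^{-1} is unambiguous
    return "(" + s + ")^{-1}"

print(iota_naive("x.y"))  # x.y^{-1}  -- reads as x.(y^{-1}), not what we want
print(iota_paren("x.y"))  # (x.y)^{-1}
```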
----------------------------------------------------------------------
Regarding my statement in the next-to-last sentence of section 2.6,
that when one forms the set T of group-theoretic terms in \{x,y,z\},
the term "y" represents "the ternary second-component function",
you point out that as a set, \{x, y, z\} doesn't specify that
y is in "second position".
Good point. I agree that writing X = \{x, y, z\} in no way
puts an ordering on x, y, z. I guess I was thinking of the
symbols x, y and z as symbolically meaning "first variable",
"second variable" and "third variable", as though they were
written x_1, x_2 and x_3. I'll have to think about what
to do here in the next revision. Thanks for pointing it out!
----------------------------------------------------------------------
You ask whether the two downward arrows in the diagram in
Definition 3.1.3 represent the same map h, or whether there
is a distinction between the map regarded as a set map and as
a group homomorphism.
They do indeed represent the same map h. Since a group homomorphism
is a set-map that satisfies appropriate conditions with respect
to the group operations, the same map can be both a set-map and
a homomorphism. Its occurrence in the triangle is based on the fact
that it is a set map, and so can be composed with the horizontal arrow
of that triangle, and the result equated with the diagonal arrow,
giving us a triangle of set maps. Its appearance on the right,
on the other hand, is based on its being a group homomorphism.
It is unique for having the combination of the set-theoretic and
group-theoretic properties shown by these two appearances.
----------------------------------------------------------------------
You ask about the remark after the last diagram in section 3.1,
that since any two free groups on X are "essentially" the
same, one often speaks of "the" free group on X.
Whether it is reasonable to use "the" in such a situation depends on
what one is focusing on. If one is interested in the group-theoretic
properties of the group, these are the same for any two isomorphic
groups, so one thinks of all free groups on X as being "the same
for the purposes of discussion", and it is natural to use "the".
On the other hand, if one is interested in questions such as whether
the set X is actually contained in F, or mapped into it by a
map u which is not an inclusion map, then this will differ for
different realizations of the free group structure, and one would
call each of them "a free group on X".
Regardless of one's point of view, it is not quite precise to say
"the" free group, since the different objects with that universal
property are not literally the same. But it is OK to speak a
little imprecisely as long as the speaker and hearer understand
what is actually meant.
----------------------------------------------------------------------
Your pro-forma question was why conditions (3.2.4)
and (3.2.5) needed to be specified to be sure T/~ was a group;
specifically, why they didn't follow from the other relations.
The idea of your answer was right -- that instead of assuming
that "~" is a relation of a naturally occurring sort, which
one could expect to satisfy (3.2.4) and (3.2.5), one should think
in terms of coming up with an "artificial" relation, which would
have no reason to satisfy those conditions merely because it satisfies
the others.
One can, in fact, describe a concrete way of getting such "artificial"
relations. Start with a relation ~_0 which does come from a
map v of X into a group G; and then let ~ be an equivalence
relation containing ~_0, gotten by choosing two equivalence classes
[p] and [q] of ~_0 and "joining them into one" under the new
relation. You should find it easy to show that ~ will then satisfy
(3.2.1)-(3.2.3) and (3.2.6)-(3.2.8), but, assuming ~_0 had more than
just the two equivalence classes [p] and [q], that ~ will not satisfy
(3.2.4), so that (3.2.9)-(3.2.11) will not give well-defined operations
on T/~.
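Here is a minimal numerical sketch of that failure (a toy example of
my own, not in the text's notation): let ~_0 have four classes,
behaving like the elements of Z/4Z under addition, and merge the
classes of 0 and 1.

```python
# Toy sketch: the four ~_0-classes behave like Z/4Z; merging the
# classes of 0 and 1 gives a relation ~ under which addition is no
# longer well defined, i.e., the analog of (3.2.4) fails.

classes = {0: "A", 1: "A", 2: "B", 3: "C"}  # [0] and [1] joined into "A"

def addition_well_defined(classes, n=4):
    for a in range(n):
        for b in range(n):
            if classes[a] != classes[b]:
                continue
            for c in range(n):
                if classes[(a + c) % n] != classes[(b + c) % n]:
                    return False, (a, b, c)  # witness to the failure
    return True, None

ok, witness = addition_well_defined(classes)
print(ok, witness)  # False (0, 1, 1): [0] = [1], but [0+1] != [1+1]
```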
You also asked whether (3.2.4) (with the other conditions)
would at least imply (3.2.5). You thought it probably wouldn't;
but in fact it will. One can show this using the fact that, from
(3.2.4) and the other conditions, T/~ acquires a structure with
a well-defined multiplication operation satisfying the consequences
of (3.2.1)-(3.2.3). It follows from (3.2.3) that for each element
[p] of that structure, [p^{-1}] is an inverse; one then argues
that inverses must be unique, which yields (3.2.5).
So what is the point of including (3.2.5)? To give a procedure
which, without such subtle arguments, shows the existence of free
groups, and which incidentally can be used with other sorts of
algebraic structures to which those subtle arguments might not
be applicable.
----------------------------------------------------------------------
You ask what is meant by a "predicate", where I say in the third
paragraph after display (3.2.8), that if we think of relations as
predicates, then the "intersection" operator on them becomes the
logical operator "and".
A "predicate" means, roughly, an assertion, with some set of blanks
to be filled in. In grammar, when one divides the sentence "The man
is happy" into "subject and predicate", one calls "the man" the
subject, and "is happy" the predicate; and that predicate can be
combined with other subjects to give other sentences. So "predicate"
is extended by logicians to mathematics, where conditions such as "="
and "isomorphic to", can be called binary predicates, "is positive"
(among real numbers) and "is prime" (among natural numbers) can be
called unary predicates, etc. A relation can be formalized as a set
of pairs, or thought of as a predicate, and the notation varies
accordingly.
----------------------------------------------------------------------
You ask for references to how the set-theoretic difficulties with
"generalized operations" raised in the 2nd paragraph after
Exercise 3.3:5 can be resolved.
Actually, they can be resolved using the ideas of section 7.4 of
these notes. I just didn't want to point the reader in section 3.3
to something that he or she might find hard to follow at this point.
----------------------------------------------------------------------
Regarding the paragraph preceding Proposition 3.4.6, you ask how the
description of what s_v does to a implies that s \neq t in
T_{red} implies s_v(a)\neq t_v(a).
Well, the description shows that if s\in T_{red}, then s_v takes
a to the word consisting of precisely the string of symbols of s,
with parentheses removed, followed by a. Now if s and t are
different members of T, then on removing parentheses, they still have
different strings of symbols, since members of T_{red} all have
parentheses clustered to the right (see discussion leading to (3.4.1))
and "^{-1}" applied only to elements of X, not to longer parenthesized
expressions. Hence adding an a at the end, we get different words
s_v(a) and t_v(a).
Does this make sense now?
----------------------------------------------------------------------
Regarding the construction of the quotient group at the end of
section 4.1, you ask whether there is some other way of constructing
that group either "from above" or "from below" ...
When we find the normal subgroup N generated by a set S of
elements of G, we can do this either "from above" or "from below",
as discussed in that section. But once we have found it, all we
have to do to impose the corresponding relations on G is to
divide out by it. So the "from above / from below" distinction
comes in at an earlier point in this construction.
----------------------------------------------------------------------
You ask about the "striking properties" of the group introduced in
Exercise 4.3:12.
The most striking one involves the concept of a "left orderable"
group -- a group G that can be given a total ordering "\geq"
such that whenever two elements g,h\in G satisfy g\geq h, and
f is any element of G, then one also has fg\geq fh. It is
easy to show that such a group has no element of finite order
other than e, and for many years it was an open question whether
the converse was true. This group turned out to be a counterexample.
(I've recently learned that the person who discovered this fact,
David Promislow, had been running a computer program to test whether
the group had a certain property known to be necessary for a group to
be right orderable. The computer turned up a counterexample to that
property, and Promislow at first thought there must be an error in
his program; but he checked it and found it was right.)
----------------------------------------------------------------------
You ask what is meant by "normal form" in section 4.4 (2nd paragraph
after (4.4.3)).
Did you check out the phrase in the index, and re-read the paragraph
where it is defined (the boldface number in the index entry)?
----------------------------------------------------------------------
You ask why the commutator subgroup, referred to in section 4.4,
is also called the derived subgroup.
I don't know. It might simply be that early in the development of
group theory, it was the one important general way of constructing
a subgroup from a group, so it was given a very "basic" sort of
name. Or it might be that group theorists first started using the
symbol G' for this important construction, and then gave it the
name "derived" because in calculus, f' denotes the "derivative"
of f.
----------------------------------------------------------------------
You asked, in connection with exercise 4.4:7, "Is there any
reason not to consider a free solvable group on a set X?"
Well, that is essentially what the last sentence of the exercise
is asking _you_!
So you need to look at the construction of free groups and the
other universal objects introduced so far, and also look at the
concept of "solvable group", and see whether you can adapt the general
methods of constructing free objects to that concept. In trying to do
so, you need to look at the differences between the concepts of "group"
and of "solvable group", and see whether those difference pose
obstructions to one or another of the methods of constructing free
objects; if you find that one of those methods goes over with no
changes at all, point this out. If all of them run into difficulties,
note what those difficulties are, and see whether you can overcome
them in at least one case. If you can't, then try to prove a
non-existence result.
Have you attempted any of this? If so, what were your results?
----------------------------------------------------------------------
You ask about the reason for the term "residually finite",
introduced just before Exercise 4.5:2.
Well, in certain contexts, a factor object of a mathematical
object is said to consist of "residues". I guess this goes
back to the ring Z/nZ, where people often think of its set
of elements as \{0, 1, ..., n-1\}, i.e., the residues that
one gets on dividing arbitrary elements of Z by n. The
most common use of the term I am aware of in algebra is when
R is a local ring with maximal ideal p; then R/p is called
"the residue field" of R. Perhaps in group theory G/N was
at some time called the "residue group" of G on dividing by N.
Then if G can be thought of "living in" the direct product of its
finite residue groups, it is reasonable to call it "residually
finite".
Other words can be substituted for "finite". E.g., if the
elements of a group can be distinguished by homomorphisms
into solvable groups, then it is "residually solvable". I'm
so used to this use of "residually" that I hadn't even thought of
where it came from.
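As a concrete instance of the "residually finite" idea (the example
is mine): the group Z is residually finite, since any two distinct
integers already differ in some finite residue group Z/nZ.

```python
# Sketch: Z is residually finite -- distinct integers a, b are
# separated by the quotient map Z -> Z/nZ for a suitable n
# (any n not dividing a - b works).

def separating_modulus(a, b):
    n = 2
    while a % n == b % n:
        n += 1
    return n

n = separating_modulus(6, 18)
print(n, 6 % n, 18 % n)  # 5 1 3 : the images of 6 and 18 differ in Z/5Z
```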
----------------------------------------------------------------------
Concerning the proof of Proposition 4.6.5, you say
> I don't quite see what the purpose of constructing this set A is ...
I hope that what I said in class answered your question. Though
it is easy to see that every element of the universal group generated
by images of G and H can be written in the form (4.6.4), and to
define operations on such expressions, it is not easy to show that
these operations satisfy the group axioms; equivalently, that the
group identities, together with the relations satisfied by our
generators, don't somehow imply equality among two different
expressions (4.6.4). But by finding a group of permutations of
a set on which all those relations are satisfied, and in which
different expressions (4.6.4) yield different permutations, we
get this conclusion, and hence have the desired description of
the "coproduct" of the groups G and H.
----------------------------------------------------------------------
You ask why there are different symbols for direct sums and direct
product of abelian groups, even though, as noted in the 5th paragraph
of section 4.7, they are the same construction.
I think it's mainly historical. The symbol A (+) B developed
within group theory and module theory, as representing an abelian
group or module in which every element was uniquely representable as
a sum a + b with a\in A, b\in B. The symbol A x B developed
within set theory, probably based on the fact that when A and B
are finite sets, with A having m elements and B having n
elements, then A x B is a set with m n elements. The concept
"A x B" spread to all areas of mathematics, since the concept of
direct product is important everywhere (e.g., if R is the real
line, then R x R is the plane). In particular, if A and
B are two rings, or (not necessarily commutative) groups, then
A x B has a natural structure of ring or group. In afterthought,
one sees that when A and B are abelian groups, this construction
A x B is isomorphic to the construction A (+) B that people had
been using all along. So the result was two symbols for the same
construction.
But as mentioned in the notes, for infinite families, the
corresponding constructions are distinct; another reason for
using distinct symbols.
----------------------------------------------------------------------
You asked whether the map X --> F has to be an inclusion.
Please remember, in submitting future Questions of the Day, to specify
what point in the text your question refers to. I think you were
referring to the map from a set X to the underlying set of the
free monoid on X, introduced near the beginning of section 4.10.
In constructions of free algebraic objects, the map u: X --> F does
not have to be an inclusion. For instance, one of the constructions
of a free abelian group that we saw in section 4.4 took the underlying
set of the group to consist of integer-valued functions on the set
X, and the universal map carried each x\in X to the function
X --> Z which had the value 1 at x, and 0 at all other points.
The element x is certainly not the same as that function.
The map u will, however, in general be a one-to-one map (often
called an "injection".)
In our notation, we often use the same symbol, e.g., x_i, for
an element of X, and its image in our universal object, relying
on context to tell us when x_i itself is meant, and when u(x_i)
is meant. But that is simply a shorthand to avoid messy notation.
----------------------------------------------------------------------
You ask whether for monoids, as for groups, the free object on
a larger number of generators can be embedded in the free object
on a smaller number of generators.
Yes. Actually, for monoids examples like that are "easier" to come
by in one way, but "harder" in another, than for groups.
"Easier" in that within the free monoid on \{x,y\}, the countably
many elements xy^n (n=0,1,2,...) are free generators of a free
submonoid. This is not so for groups: if we abbreviate xy^n
to z_n, we find that (z_m)^{-1} z_n = (z_0)^{-1} z_{n-m}, so
the elements z_0, z_1, ... satisfy nontrivial group relations.
(However, there is also an easy example in free groups: the elements
w_n = y^{-n} x y^n generate a free subgroup.)
"Harder" because while it is known (though not that easy to prove)
that every subgroup of a free group is free on some set of generators,
this is not true in monoids. E.g., if x\in X and F is the free
monoid on X, then the elements x^m as m ranges over all
nonnegative integers other than 1 (or more generally, over all
nonnegative integers greater than some fixed m>0), form a submonoid
which is not free.
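A finite spot-check (not a proof, and the encoding is mine: elements
of the free monoid on {x,y} are simply strings in x and y) of the
freeness of the submonoid generated by the z_n:

```python
from itertools import product

# Spot-check that the words z_n = x y^n generate a free submonoid of
# the free monoid on {x, y}: distinct sequences of indices give
# distinct concatenations.

def z(n):
    return "x" + "y" * n

seqs = list(product(range(3), repeat=2))   # all pairs (i, j) with i, j < 3
words = ["".join(z(i) for i in s) for s in seqs]
print(len(seqs), len(set(words)))          # 9 9 : no collisions
```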
----------------------------------------------------------------------
You asked, regarding the connection between the Weyl algebra and
Quantum Mechanics mentioned after Exercise 4.12:3, whether I could
recommend any texts giving a mathematical treatment of Quantum
Mechanics.
I asked a few colleagues, and two books were suggested. One is
an old book, "Mathematical Foundations of Quantum Mechanics" by
George W. Mackey, now republished by Dover. (Dover republishes
old out-of-copyright mathematical works at low prices -- a valuable
service!) The other is "Introduction to Quantum Mechanics" by
Hannabuss. I also learned that we have a course in the mathematics
of quantum mechanics, Math 189.
----------------------------------------------------------------------
You asked about the treatment of Galois theory via tensor products,
mentioned after Exercise 4.13:4.
There are two write-ups of courses Lenstra has taught on the subject;
he describes them as "each having its own imperfections":
http://websites.math.leidenuniv.nl/algebra/topics.pdf
http://websites.math.leidenuniv.nl/algebra/Galoistheoryschemes.pdf
The first is notes from a 250A he taught here, written up by
him and two students. The second is from a course he gave long ago
in Leiden (written up by an unknown person, and found on the web),
which did the analog of Galois theory for rings more general than
fields.
He would be happy to learn of any errata you find in either of them.
You also mention the site
http://en.wikipedia.org/wiki/Grothendieck's_Galois_theory
Unfortunately, at the moment Wikipedia seems to be down, so I
can't check it out.
----------------------------------------------------------------------
You ask about the statement following Exercise 4.14:3 that the
idempotents in a commutative ring R correspond to the continuous
{0,1}-valued functions on its spectrum. I assume below that you are
familiar with basic algebraic geometry:
On the one hand, suppose r\in R is idempotent. Then 0 = r - r^2 =
r(1-r), so any prime ideal P contains either r or 1-r.
Assume P contains r. Then 1-r, being 1 -(member of P),
is not in P, so it is invertible mod P, so when one localizes
at P one can cancel it from "r(1-r)=0", getting r = 0 in the
localization. Likewise, if 1-r\in P, then in the localization,
1-r=0, i.e., r=1. So the continuous function on Spec R induced
by r is everywhere {0,1}-valued.
Inversely, if f is a continuous {0,1}-valued function on Spec R,
then the subsets of Spec R on which it is 0, respectively 1,
will be open-closed. Hence in the structure sheaf, the function
agreeing with the global section 0\in R on the first of these sets
and with 1\in R on the other, i.e., f itself, will be a global
section, i.e., a member of R.
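A finite sketch of the first direction (the example is mine): in
R = Z/6Z, whose spectrum has the two points (2) and (3), every
idempotent reduces to 0 or 1 at each prime.

```python
# Sketch in R = Z/6Z: Spec R has the two primes (2) and (3); each
# idempotent r satisfies r = 0 or 1 modulo each of them.

idempotents = [r for r in range(6) if (r * r) % 6 == r]
print(idempotents)  # [0, 1, 3, 4]

for r in idempotents:
    print(r, "->", (r % 2, r % 3))  # the {0,1}-valued function on Spec R
    assert r % 2 in (0, 1) and r % 3 in (0, 1)
```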
----------------------------------------------------------------------
Regarding the construction of coproducts of sets in section 4.15,
you write
> Let's say we have S=(S_0,S_1) we are forming the coproduct Q. If a,b
> are in S_0, and if (a\cup b)=c, then we would expect the left
> injection, as a homomorphism, to satisfy ...
"Homomorphism" is a concept relevant to algebraic structures; it
means map between underlying sets that refers to the sort of algebra
in question. When we talk of a coproduct of sets, these are sets
without any additional structure; the analog of a homomorphism is
just a map of sets; so it is not expected to satisfy any other
conditions.
I guess you were thinking that a homomorphism of sets should respect
"the kinds of things one studies in set theory", which includes
things like the operation of taking unions of sets. That kind of
structure is what we studied in the section on Boolean algebras. But
the concept of a map of sets is simply that of a function, so the
coproduct of a family of sets is simply a universal instance of a
set with a function from each of those sets into it.
----------------------------------------------------------------------
You ask about the use of the phrase "generators and relations"
with respect to sets, following Exercise 4.15:1, asking how one can
"generate" anything when one is not given any algebra-operations.
We are carrying the phrase "generators and relations" over from
the cases of groups, monoids, rings, etc., in order to show the
parallelism of the constructions. But in the case of sets, the
situation is indeed degenerate, and nothing more is "generated"
than the given elements of X.
----------------------------------------------------------------------
Regarding the construction of the set obtained from a
set X by imposing relations given by R following Exercise 4.15:1,
you write
> ... the relation will make certain elements of X equivalent and
> then we just need to pick one of them as a representative. ...
That is an old-fashioned way of looking at these constructions. It
sometimes has advantages; but usually the nicer way is to let the
set of equivalence classes itself be one's new set. The
one "disadvantage" of that approach is the problem of visualizing
the result as a "collection of collections". The way I tell my
students in Math 113 to think of the result of "dividing" a set by
an equivalence relation is that such a set consists of new elements,
each of which arises by "gluing together" one or more elements of X.
No mathematical difference; but picturing them as "stuck together"
rather than "loose" may be less confusing.
One way or the other, the idea is to have a set having a many-to-one
relationship with X: different elements of X correspond to the same
element of the new set if and only if they are in the same class under
the equivalence relation.
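Here is a small sketch (the names are mine) of the "classes as new
elements" construction, gluing together the elements related by the
given pairs:

```python
# Sketch: the quotient of a set X by the equivalence relation
# generated by a list of pairs; the new elements are the classes
# themselves.

def quotient(X, pairs):
    classes = {x: {x} for x in X}        # start with singleton classes
    for a, b in pairs:                   # glue the classes of a and b
        if classes[a] is not classes[b]:
            merged = classes[a] | classes[b]
            for x in merged:
                classes[x] = merged
    return {frozenset(c) for c in classes.values()}

print(quotient({1, 2, 3, 4}, [(1, 2), (2, 3)]))
# two new elements: the classes {1, 2, 3} and {4}
```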
----------------------------------------------------------------------
In connection with the examples at the beginning of section 4.17,
you ask (explaining that you have not yet taken a topology course) what
is meant by "closure".
In general, the closure of a subset S of a topological space
T means the set of points of T that are either in S, or are
limit points of members of S. When the topology comes from a
metric (distance function), something that you will have seen
in Math 1AB and 53 (even though not under the names "topology"
and "metric"), a limit point of S is simply a point which is the
limit (in the sense of those courses) of a convergent sequence of
points in S. So, for instance, when T is the real line, and S
is an open interval (a,b), its closure is the closed interval
[a,b]; while for the same T, and S the set of rational numbers,
the closure of S is the set of all real numbers. This description
of the closure is not valid for topological spaces that aren't
metric spaces, as noted in lines 4-6 after Exercise 4.17:1, but it
at least gives a start at picturing the concept.
> ... How does this not hold in the example (4.17.1)?
(4.17.1) consists of two examples, side-by-side; let's talk
about the first. It represents the image of the real line
under a continuous map into a compact rectangular region of
the plane. The closure of that image consists of the image
itself, the left endpoint of the wiggly line (which is not itself
in the image), and, on the right side -- well, you can see that
limits of sequences of points in that wiggly line will exist
all up and down a vertical interval. So there is no way to extend
the map R --> K to a continuous map R\cup\{+-\infty\} --> K:
the point +\infty would have to simultaneously be mapped to all
the points of that vertical interval.
The other example is similar.
----------------------------------------------------------------------
You ask whether, if we replace "compact Hausdorff" with "locally
compact Hausdorff" in the definition of the Stone-Cech compactification
(Definition 4.17.2), we get a "local Stone-Cech compactification"
construction, which turns Q into R.
Unfortunately, no. To see this, consider any irrational number
\alpha, and let f be the inclusion-map of Q into R - \{\alpha\}.
The space R - \{\alpha\} is locally compact (every point of that
space has compact neighborhoods), but it has no point "where \alpha
should go", so the map f does not factor through the inclusion
of Q in R. Intuitively, this shows that in making Q locally
compact, there is no need to insert an element "where \alpha should
go". Since \alpha was an arbitrary irrational number, there is in
fact no extra point that "has to be inserted". Yet if we don't insert
any point, we don't get local compactness; so I don't think there is a
universal local-compactification.
For a little more intuition, let \beta be the cube root of 2.
Let h: Q --> Q be the function such that h(x)=x for x not
between \beta and \beta^2, but h(x) = 2/x for x in that
interval. This self-homeomorphism of Q turns that interval
upside-down, while leaving the rest of Q unchanged, showing that
the topology of Q is far from determining its order-structure.
Hence there can be no natural way to construct from the topological
space Q the space R, whose topology does almost determine its
order-structure. (It determines it up to reversal.)
However, if we regard Q as a metric space, then that metric space
certainly does determine the metric space R, namely, as its
completion; and the completion construction can indeed be regarded
as a universal construction on metric spaces.
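A quick numerical spot-check of that self-homeomorphism (floats
stand in for the irrational endpoints here, so this is illustration
only):

```python
# Sketch: the map h with h(x) = 2/x on the interval (2^(1/3), 2^(2/3))
# and h(x) = x elsewhere; a float-based check that h is its own
# inverse and carries the interval into itself.

b = 2 ** (1 / 3)                      # cube root of 2 (approximated)

def h(x):
    return 2 / x if b < x < b * b else x

for x in [0.5, 1.3, 1.5, 2.0]:
    assert abs(h(h(x)) - x) < 1e-12   # involution: h(h(x)) = x
    if b < x < b * b:
        assert b < h(x) < b * b       # the interval maps into itself
print("ok")
```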
----------------------------------------------------------------------
You ask whether Lemma 4.17.3 can be proved without the Hausdorffness
condition on K.
No. For example, let K be any set, given with the topology under
which the closed subsets are K itself and its finite subsets.
Then any infinite subset of K is dense. Hence if we let X be any
infinite set, we can map it into topological spaces K constructed
as above so as to have dense image, though there is no upper bound
on the cardinalities of such spaces K.
----------------------------------------------------------------------
Regarding the discussion in the middle of section 4.18, you write
> You note that for a space to possess a universal covering space
> it must in general be semi-locally simply connected. ...
No; I note that this and the other conditions listed are the
assumptions that are shown in [90] to be sufficient to make the
construction described work; I don't say that a universal covering
space can't exist if they don't hold. The author of [90] may even
have known of more general hypotheses under which a universal covering
space exists, but decided that it would be best to prove a result that
applied to most "naturally arising" spaces, rather than go through a
much messier argument to cover some additional pathological cases.
In particular, it looks to me as though some spaces that do not
satisfy the condition of being locally pathwise connected (which
is also in the list), can have universal covering spaces. E.g.,
consider the subset of the plane consisting of the union of the
line-segments y = cx with x\in [0,1], and c taking on
the value 0 and all values 1/n for positive integers n.
This X is contractible, hence simply connected, so it should be
its own universal covering space; but it is not locally pathwise
connected (that fails near any point (x,0) with x>0).
However, it is indeed hard to see how a space that is not semi-locally
simply connected could have a universal covering space. To see
why, let us assume for simplicity that this condition fails in the
neighborhood of the base-point x_0. Then there are non-contractible
loops that stay arbitrarily close to that point. Suppose
(p_i)_{i=1,2,...} is a sequence of such loops that stay closer and
closer to x_0. Then the sequence p_i will approach the trivial
loop, which stays at x_0. Hence by continuity of the map
p\mapsto \~{p}, their liftings \~{p_i} should be approaching
the constant base-point map in the universal covering space. But
since they are non-contractible, they should end up in different
"layers" of that covering space. If points in those different
layers can approach the basepoint, this contradicts the definition
of covering space, which forces the inverse image of every point
of X to be discrete.
However, for non-locally-pathwise-connected X, maybe it would
be most natural to modify the definition of covering space, e.g.,
by replacing "discrete" with "totally disconnected".
(Note: I haven't actually looked at algebraic topology since I
was a student in the '60's; so the above comments are not based
on reliable expertise.)
----------------------------------------------------------------------
Regarding Definition 5.1.3, you ask whether there is a reason
why we speak of "isotone maps" of partially ordered sets, rather
than "homomorphisms".
When we get to category theory (Chapter 6), we'll introduce the
general term "morphism", covering the various sorts of maps that
come up in different areas of mathematics: homomorphisms of
algebras, isotone maps of partially ordered sets, continuous
maps of topological spaces, etc. Till then, we are using the
traditional terms; and "homomorphism" is traditionally used
for functions that respect operations on algebras. If we have
entities that mix algebraic and non-algebraic structure, such
as topological groups, we may say "continuous homomorphism" or
"homomorphism as topological groups"; but one rarely uses
"homomorphism" when there are not some operations to be respected.
----------------------------------------------------------------------
In connection with the concept of an isotone, i.e., order-respecting
map, defined in Definition 5.1.3, you ask whether there is a term for
an order-reversing map.
Yes. It's called an "antitone" map. Not too commonly used, but
it exists.
----------------------------------------------------------------------
In connection with the paragraph before Definition 5.1.4, which
notes that partially ordered sets are not algebras in the sense
used in this class, you ask "Could we not get around this issue
by concentrating on relations instead of operations?"
Yes; but then objects such as groups, rings, etc. would have to
be described as having certain relations R subject to the
condition that for every x and y there exists a unique z
such that (x,y,z)\in R; and results on free groups, rings, etc., on
groups, rings, etc. presented by generators and relations, and so
on, would all have to be formulated in terms of relations with this
particular property. So to create a comfortable context for studying
these things, I have chosen to work with algebras, defined by
sets with operations, and to consider more general relational
structures to be nearby relatives which we visit when we need them.
----------------------------------------------------------------------
You suggest that the rule that associates the graph called the Hasse
diagram to a finite poset (two paragraphs before Definition 5.1.6)
can be used for arbitrary infinite posets P as well.
Unfortunately, the graph you get can lose a lot of information
from an infinite P. Think of what you get from the poset of real
(or rational) numbers!
Can you figure out under what conditions on P that diagram will
preserve all the order relations?
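(For a finite poset, by contrast, the diagram can be computed
mechanically. Here is a small Python sketch, with a function name of my
own choosing; note that for a dense order such as the rationals, the
same rule would return no edges at all, which is one way the diagram
loses information.)

```python
def hasse_edges(elements, leq):
    # Edges of the Hasse diagram: the covering pairs a < b with no
    # element strictly between them.
    elements = list(elements)
    return [(a, b)
            for a in elements for b in elements
            if a != b and leq(a, b)
            and not any(c not in (a, b) and leq(a, c) and leq(c, b)
                        for c in elements)]

# Divisibility on {1,...,8}: only the covering relations survive.
print(hasse_edges(range(1, 9), lambda a, b: b % a == 0))
```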
----------------------------------------------------------------------
You ask about the 1/3 and 2/3 in Fredman's Conjecture (Exercise 5.1:11).
These are forced on us by the 3-element partially ordered set
that has one pair of comparable elements, and one element
incomparable to both of these. (E.g., the set of integers {2,3,4}
under divisibility.) There are only three linearizations of
that ordering, so for any pair of incomparable elements, the ratio
of the linearizations that put one above the other to those that
put them in the reverse order can only be 1:2 or 2:1; so the
former have to constitute 1/3 or 2/3 of the linearizations.
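That count is small enough to check by brute force; here is a quick
Python enumeration (my own illustration) of the three linearizations:

```python
from itertools import permutations

# The 3-element poset {2,3,4} under divisibility: 2 < 4, with 3
# incomparable to both.  List its linearizations (linear extensions).
exts = [p for p in permutations((2, 3, 4))
        if p.index(2) < p.index(4)]
print(len(exts))                        # 3 linearizations

# For the incomparable pair (2,3): in how many does 2 come first?
two_first = sum(p.index(2) < p.index(3) for p in exts)
print(two_first, len(exts) - two_first)  # a 2 : 1 split, i.e. 2/3 vs 1/3
```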
One can get more examples where the best one can do is 1/3 or
2/3 from the above one; e.g., by throwing in a chain of elements
all lying above the three elements mentioned, and/or a chain of
elements all lying below them, or in more ingenious ways, such
as by putting one copy of the above poset on top of another.
But I would guess that if one excluded posets constructed
in these ways, then on those that remained, one could assert some
narrower interval around 1/2 than [1/3, 2/3]. (One might do
this exclusion by defining on each poset P the equivalence
relation generated by the condition of being incomparable under
the given ordering. Then the posets that would have to be
excluded would be those in which all equivalence classes had
cardinality 1 or 3, and the 3-element equivalence classes had
two elements comparable under the ordering.)
----------------------------------------------------------------------
You ask whether we can generalize the preorder "divides", referred
to in the first paragraph of section 5.2 as a relation on elements
of commutative rings, to noncommutative rings.
We can, once we decide what definition to use. One says that
x "right divides" y if we can write y = ax, and that it "left
divides" y if y = xb; each of these is a preorder; they
correspond to the inclusions of the left ideals Ry \subseteq Rx,
respectively of the right ideals yR \subseteq xR, generated by our
elements. One could similarly call x an "interior divisor"
of y if y can be written axb, and this would also be
a preorder, though I have never seen this considered in ring
theory. (This relation is not equivalent to inclusion between
the 2-sided ideals RxR and RyR, for the reason mentioned
in the two sentences preceding Ex.5.12:2. The relation
of inclusion between RxR and RyR would yield still another
"divisibility-like" preorder on elements.)
----------------------------------------------------------------------
You ask about the motivation for the concept of Gelfand-Kirillov
dimension, developed in Exercises 5.2:2-9.
It originates in ring theory rather than monoid theory; cf. discussion
in the paragraph preceding Exercise 5.2:5. If one looks at
easy examples such as polynomial algebras k[x], k[x,y], etc., one
finds that the first grows linearly in i, the second quadratically,
etc.; and for more complicated structures, commutative or
noncommutative, one encounters similar patterns -- the dimension
tends to grow either as i^d for some d, or exponentially in i
(e.g., for a free associative algebra). GK(R) is defined so as to
"capture" the number d such that R grows like i^d if there
is one; it gives infinity if one has exponential growth. If one
tries, one finds that one can construct noncommutative rings
for which GK(R) is not an integer or infinity; but it still gives
some real number. Exercise 5.2:8 shows that the same growth rates
occur for algebras as for monoids, so in a text like this, which
assumes little ring-theoretic background, it is convenient to
devote most of the development to the monoid case.
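As a quick illustration (mine, not from the text), one can tabulate
these growth functions, i.e. the dimension of the span of the monomials
(respectively, words) of degree at most i:

```python
from math import comb

def dim_poly(n_vars, i):
    # Commutative polynomials k[x_1,...,x_n]: monomials of total
    # degree <= i number C(i+n, n), which grows like i^n.
    return comb(i + n_vars, n_vars)

def dim_free(n_vars, i):
    # Free associative algebra on n letters: words of length <= i,
    # which grows exponentially in i.
    return sum(n_vars ** j for j in range(i + 1))

for i in (1, 5, 10, 20):
    print(i, dim_poly(1, i), dim_poly(2, i), dim_free(2, i))
```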
----------------------------------------------------------------------
You note that the well-ordered sets (Definition 5.3.2) can be
characterized as the totally ordered sets all of whose reverse-well
ordered chains are finite, and ask whether there is a nice
characterization of the totally ordered sets all of whose
reverse-well ordered chains are countable, noting that this class
includes the ordered set of real numbers.
I don't know. I wonder whether they are those that can be embedded
in a lexicographic product \alpha\times\mathbb{R}, where \alpha
is an ordinal, and \mathbb{R} is the ordered set of real numbers?
(Here one could replace \mathbb{R} by, say, the open or closed real
unit interval (0,1) or [0,1], since the former is order-isomorphic
to \mathbb{R}, while \mathbb{R} and [0,1] are mutually embeddable.)
----------------------------------------------------------------------
> ... You mention in the paragraph after Exercise 5.3:3,
> that some arguments for uniqueness of a differential equation
> solution use connectedness. ...
The simplest case is the equation y' = 0. On the real line,
the set of solutions is the set of constants, while on a domain
like (0,1) \cup (2,3), the solutions will be constant on each
connected component, but can have different values on the two
components, giving a 2-dimensional space of solutions instead of
a 1-dimensional space. From this, one gets similar behavior for
equations y' = f(x) for any continuous f(x); and one gets
analogous results for higher order equations; though, depending
on the nature of the equations, there may or may not be complications
in the existence and uniqueness results other than those resulting
from non-connectedness.
But differential equations are far from my field; so I'm
definitely no source of expert knowledge on the subject, or
on how the experts look at it.
----------------------------------------------------------------------
Regarding transfinite induction (a term not used in the notes), you
ask whether this refers to induction, in the sense of Lemma 5.3.4,
over some ordinal.
Right.
> Also, I would enjoy seeing an example of some proposition that can be
> proven via general or transfinite induction, but not via standard
> induction on N.
Exercise 5.3:5 proves a standard result on symmetric polynomials
by induction over a well-ordered set which is isomorphic to an
ordinal > \omega, so it can be thought of as a transfinite
induction. There will be more examples in section 5.5, where we
will study ordinals. (The definitions of ordinal arithmetic
in (5.5.7-9) and (5.5.10) are by transfinite recursion, and so
one proves results about these by transfinite induction; it is
also used in proving Lemma 5.5.12(ii).) In the next chapter, it
is used in proving Lemma 6.2.1; and still later, in section 9.2.
----------------------------------------------------------------------
You ask about my statement at the end of the 4th paragraph before
Corollary 5.3.6 that because of the axiom of regularity, we can make
set-theoretic constructions recursively, and whether without that
axiom, we have to use methods other than recursion.
What I meant was, "by recursion with respect to the membership
relation", since regularity shows that this has DCC. Without
regularity, one can still do recursion with respect to any index
set having DCC, and this is still an important tool in set theory.
----------------------------------------------------------------------
In connection with the statement (three paragraphs after display
(5.4.1)) that there is no easy way to extend the definition of an
ordered pair as the set \{\{X\},\{X,Y\}\} to a definition
of ordered n-tuple for larger n, you ask why we can't just define
the ordered n-tuple (x_1,...,x_n) recursively as the ordered pair
((x_1,...,x_{n-1}),x_n) when n > 2.
Nice! I hadn't thought of that.
However, there are still advantages to the approach described in
the text. In particular, it has the useful property that if an
m-tuple (x_1,...,x_m) and an n-tuple (y_1,...,y_n) are equal, then
m=n and x_i=y_i for all i; while under the approach you suggested,
every n-tuple would also be an m-tuple for all m with 2 \leq m \leq n.
----------------------------------------------------------------------
> How do we formalize the notion of a construction that may not
> necessarily be a function within ZFC? ...
We formalize it as a rule which describes, in every case, what the
value should be. This is like the situation to which the Axiom of
Replacement is applied.
> If the set of functions F is fixed, it seems like we should be able
> to treat r as a function from a subset of X x F ...
But the relevant set of functions is not, in general, fixed; for
instance, the assignment \alpha |-> \aleph_\alpha
is not itself a function in our set theory.
----------------------------------------------------------------------
You ask about the significance of singular cardinals (Def. 5.5.18).
When dealing with a cardinal \kappa, one can usually say that the
union of a family of <\kappa sets, each of cardinality <\kappa,
is itself of cardinality <\kappa. (For instance, taking \kappa =
\aleph_0, this is the assertion that a finite union of finite sets
is finite.) The singular cardinals \kappa are precisely the
exceptions. Fortunately they are, as I mention, sparse, so
if one wants to use that principle in the proof of a theorem, one can
throw in the hypothesis "Let \kappa be a regular cardinal" and one
hasn't lost much.
As an example, if you look at p.2 of my paper with Shelah at
http://math.berkeley.edu/~gbergman/papers/Sym_Omega:2.ps ,
you will see, among the people we thank, Peter Biryukov. We don't
say there what we thank him for. It was he who pointed out to us that
the assertion we make in the paragraph at the bottom of p.3 was not true
as we originally formulated it, without the condition of regularity.
----------------------------------------------------------------------
In connection with Theorem 5.6.2, you ask whether, since
every set is in bijective correspondence with an ordinal, this
can be described as a universal property of ordinals.
Not in any way that I can see. The word "universal" has various
uses in math; for instance, "\forall" and "\exists" are respectively
called the "universal" and the "existential" quantifiers; so there
are very likely some statements using the word "universal" that follow
from the above fact about ordinals. But what are called "universal
properties" involve existence of unique maps.
----------------------------------------------------------------------
In connection with the suggestion preceding Exercise 5.6:2, that
if you haven't seen proofs by Zorn's Lemma before you might look
at such proofs in standard graduate algebra texts, or ask your
instructor for some elementary examples, you ask for such examples.
Most basic graduate-level algebra texts have many such proofs, but it
takes some work to find where they are. One text where one can find
these by an online search is Hungerford's "Algebra". If you go to
https://books.google.com/books?id=e-YlBQAAQBAJ and type "zorn" in
the box saying "Search inside", you will get 23 results, and turning
to the corresponding pages in a physical copy of the text, you can
find the proofs in question.
However, I suspect you can do *most* of Exercises 5.6:2-5.6:14
without looking at additional examples. It is unfortunate that I
began that string of exercises with 5.6:2, which, in its present form,
does not make it at all obvious how Zorn's Lemma would be used. So
look at the next few, and see whether you have better luck.
----------------------------------------------------------------------
You ask about the intuitionists' rejection of the Law of the Excluded
Middle, i.e., the assertion that if a statement has neither a proof
nor a counterexample, it will be neither true nor false (mentioned in
section 5.7); and you ask whether there are in fact any such statements.
I know of all this only from hearsay; I haven't studied the history.
But I would think that, first of all, they would have objected to the
application of the Law of the Excluded Middle to a statement for which
no proof or counterexample was *known*: if we don't know a proof or
counterexample, and can't prove that one of these exists, then they
would insist that we can't say that the statement must be true or false.
And there are plenty of things for which we don't know a proof or a
counterexample.
(As mentioned in a related context in Wednesday's class, I think
that their attitude came out of "logical positivism", which says
that a statement is only meaningful if one can describe a test that
will determine whether it is true; which I think was a reaction
against philosophers' use of vague terms without giving satisfactory
definitions of them.)
As for whether there in fact exist statements that can neither be
proved nor disproved by a counterexample -- Goedel's Incompleteness
Theorem shows, roughly, that given any sufficiently strong mathematical
language and any set of precise rules for reasoning about the
statements in such a language, there will be statements that can
neither be proved nor disproved. But typically, these statements are
equivalent to assertions "For all integers n, a certain computation
gives a certain result"; so if they can't be disproved,then in
particular, they can't have counterexamples, so the proof of Goedel's
result proves that they are true, even though that can't be proved
using whatever precise rules of reasoning were assumed given.
Whether similar results have been obtained that don't, in this
way, show that a statement is "really" true or false, I don't know.
But I certainly think it is plausible that there are mathematical
statements whose truth or falsity can't be established in any way.
----------------------------------------------------------------------
Regarding section 5.7, you write,
> ... You say anything proved within our systems may model
> the real world ...
What I meant is that if we set up a mathematical model of some aspect
of the real world, say in terms of differential equations, and we
ask a question about how that model behaves, and answer it with the
help of the Axiom of Choice, then assuming the question is equivalent
to the question of how some numerical computations come out (say
computations that approximate the differential equation more and more
closely using finite-difference equations), then what we deduce using
the Axiom of Choice must be consistent with the results of our
computations, and so must represent the behavior of the real world
with as much accuracy as our model does.
> My question is do you know any examples where it is "convenient" to
> accept AC ...
I think the statement that every vector space has a basis is such
a result -- it allows us to picture "exactly" what all linear maps
between two vector spaces look like. For instance, we can say that
given vector spaces V and W, any homomorphism f from a subspace
of V into W can be extended to a homomorphism from all of V
into W. The existence of such extending homomorphisms may not itself
be a useful fact about the real world, but it shows us that, since
such a homomorphism f can always be extended, there is no point in
looking for restrictions that extendability implies on f; and it is useful
to know what not to waste our time on. (In contrast, the statement
that, say, an additive homomorphism of abelian groups f: 2\Z --> \Z
can be extended to all of \Z does restrict f: such an f must have
image consisting of even integers, though a general map 2\Z --> \Z
need not.)
> ... and also do you think it would be "inconvenient" to work
> in a system either without AC or one with the negation of AC?
Yes.
----------------------------------------------------------------------
You ask about the relation between the concept of lattice defined
in section 6.1, and the use of that term that you were familiar with,
for a discrete subgroup of R^n.
I think that the concept you are referring to and the concept
defined in section 6.1 are each named based on the way a picture
representing the mathematical structure looks. The Oxford English
Dictionary's first definition of lattice is "A structure made of
laths, or of wood or metal crossed and fastened together, with
open spaces left between". If you look at some of the pictures at
https://en.wikipedia.org/wiki/Lattice_(order)#Examples , you'll see
how these resemble such a structure. On the other hand, although a
picture of the concept you described might not include line-segments,
it is suggestive of the real-world lattices that people build in its
repeating regularity; this is reflected in the OED's definition 4.a
for "lattice": "Any regular arrangement of points or point-like
entities that fills a space, area, or line; spec. a crystal lattice
or a space lattice; ...".
(Russian has two competing words for the concept defined in
section 6.1: "struktura", which simply means "structure", and so
is ambiguous, and "reshotka", meaning "sieve", which has the
above pictorial quality. I don't know what Russian uses for the
concept you described.)
----------------------------------------------------------------------
You ask why I emphasize "pointwise" in the second paragraph after
Definition 6.1.4.
Hard to remember exactly what was in my mind when I put in that
emphasis. I guess the idea was that in a given set S of functions,
two functions f and g may have a least upper bound, i.e.,
a least member h of S that is everywhere \geq f and \geq g,
without our being able to say much about this h; and that the reader
might carelessly think that this is all that "the maximum of f
and g" referred to if they missed the word "pointwise". I tend to
be a sloppy reader in such ways, and assume my audience is also
likely to be. And in general, when a concept that hasn't come up
previously in what one is doing is introduced, it is useful to bring
it to the reader's attention, and not let it get passed over unnoticed.
----------------------------------------------------------------------
You ask why, as mentioned in the 2nd paragraph before Exercise 6.1:2,
some people write lattice operations using the symbols for addition
and multiplication.
I'm not sure, but I can make several guesses. I don't know when
the symbols \vee and \wedge were introduced; there may have
been a time when they were not common, and people simply tried
to choose existing symbols with the closest meanings. "+" is natural
for "putting things together"; moreover, for unions of subsets of
a set, if we think of those subsets as represented by {0,1}-valued
functions on the set, the union can be thought of as "addition as
integers, with 1 made a ceiling"; while in most natural contexts,
meets are intersections, which correspond to products of {0,1}-valued
functions. Even after \vee and \wedge had been introduced, some
people may have simply stuck with the symbols they had learned first.
Also, for a long time we didn't have computers on which to compose
mathematics, and typewriters generally didn't have special symbols,
but did have +, while "xy" didn't require any symbol. Finally,
some people may use "arithmetic" symbols because they feel it
valuable to stress the analogy between lattices and rings; one
can speak of "ideals" in a lattice, for instance (subsets closed under
internal joins and meets with arbitrary elements). Even though these
don't play the role of determining the structure of the image of
homomorphisms, as they do in rings, they have some uses.
----------------------------------------------------------------------
You ask why 0 and 1 are used for the least upper bound and the
greatest lower bound of the empty family (as indicated in section
6.2) rather than "some other notation (possibly involving $\infty$)".
In the lattice of subsets of a set, looked at as \{0,1\}-valued
functions, the empty set is the constant function 0, and the
total set is the constant function 1. Also, just as in rings,
0 and 1 are the neutral elements for + and ., so in lattices,
a least and a greatest element will be neutral elements for
\vee and \wedge.
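Concretely, in the lattice of subsets of a 3-element set (a small
Python sketch of my own):

```python
from functools import reduce

X = frozenset({1, 2, 3})

def join(family):
    # Least upper bound of a family of subsets; the empty join is the
    # least element "0", here the empty set.
    return reduce(frozenset.union, family, frozenset())

def meet(family):
    # Greatest lower bound; the empty meet is the greatest element "1",
    # here the whole set X.
    return reduce(frozenset.intersection, family, X)

print(join([]) == frozenset(), meet([]) == X)          # True True
# 0 and 1 are neutral elements:  A v 0 = A  and  A ^ 1 = A.
A = frozenset({1, 3})
print(join([A, frozenset()]) == A, meet([A, X]) == A)  # True True
```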
----------------------------------------------------------------------
You ask, in connection with the symbols "0" and "1" for the least
and greatest element in a lattice having these, noted in section 6.2,
whether this is related to the fact that in a ring, 0 and 1 generate
the smallest and largest ideals.
The use of "0" and "1" is certainly related to the ring-theoretic and
ideal-theoretic analogies; in particular, the case of Boolean rings;
and that case is in turn related to the fact that in the set 2^X of
subsets of a set X, identified with their characteristic functions,
the empty set and the whole set correspond to the constant functions
0 and 1.
----------------------------------------------------------------------
You ask why the fact noted following Exercise 6.2:5 that infinite
meets and joins in a complete lattice are not operations of a fixed
arity is a problem for complete lattices and not for <\alpha-complete
lattices in general.
For <\alpha-complete lattices, one can regard it as a problem, but a
problem with an easy solution, noted in the middle of the paragraph
in question: Regard such objects as having a set of "meet" and "join"
operations, one for each arity <\alpha. The difference in the case
of unrestricted complete lattices is that the resulting system of
operations will not form a set, since there is not a set of all
cardinals.
Given a particular such complete lattice L, one can argue that
one doesn't "need" operations of arities bigger than card(|L|). But
in many situations one isn't _given_ L, one wants to construct
an L, or say whether there is an L with a particular property,
so one can't know in advance a certain set of operations that will
be sufficient. And this is exactly what goes wrong in the matter
that paragraph refers us to, Exercise 8.10:6(iii).
You also suggest regarding meet and join as unary operations on P(|L|).
Well, the theory of sets with two unary operations would then be
applicable to that structure; but that isn't the same as the theory
of L as an algebra.
----------------------------------------------------------------------
You ask why, in the paragraph preceding Exercise 6.2:12, I attach
importance to \omega^X being a full direct product.
If we try to extend the result of the preceding exercise to
non-complete lattices L, we find that we cannot in general map any
lattice of the form P(X), i.e., 2^X, onto it by a complete upper
semilattice homomorphism. But we could if we allowed ourselves to use
for our domains complete upper subsemilattices of P(X). (Just embed
L in a complete lattice L', find a map f of some P(X) onto L' as
in the preceding exercise, and then note that f^{-1}(L) is a complete
upper subsemilattice of P(X), which f maps onto L.) So getting a
surjective map on a general subsemilattice of a direct product is
easier than getting such a map on a full direct product; and my comment
notes that we are not taking that easy way out, here.
----------------------------------------------------------------------
You ask whether there is a characterization of "cocompact" elements
(sentence after Exercise 6.2:15) in the lattice of subgroups of a
group.
Well, one context where such a concept comes up is in module
theory. A nonzero module is called simple if it has no proper
nonzero submodule, and the submodule of a module M generated
by all its simple submodules is called the "socle" of M. One
can show that the zero submodule of M is cocompact in the
lattice of all submodules of M if and only if every submodule
of M contains a simple submodule, and the socle of M is
finitely generated. In particular, looking at Z-modules, i.e.,
abelian groups, it follows that the zero subgroup of an abelian
group A is cocompact in the lattice of all subgroups if and only
if every element of A has finite order, and A has only finitely
many elements of prime order. I believe the above fact on modules
over a general ring is somewhat useful in module theory.
But I don't know a criterion for a general submodule N of a module
M to be cocompact in the submodule lattice. One cannot say that this
will hold if the zero submodule of M/N is cocompact in the lattice
of submodules of M/N. For instance, let A be the abelian group
of exponent p which is free as a Z/pZ-module on a basis
x_0, x_1, ..., x_n, ...
and B the submodule of A consisting of the elements in which the
sum of the coefficients of the above basis elements is 0. Then
A/B =~ Z/pZ, so the zero element in the submodule lattice of
A/B is certainly cocompact. But B is not cocompact in the
submodule lattice of A. To see this, look at the submodules
A = A_0 > A_1 > A_2 > ... where A_i is generated by all basis
elements x_j with j\geq i. We see that B contains none of the A_i,
but does contain their intersection (which is 0), proving
noncocompactness of B.
I haven't thought about the corresponding questions for nonabelian
groups.
----------------------------------------------------------------------
> In Definition 6.3.2 you say the "class of subsets". What does
> this mean?
It means "set of subsets". In general, "class" has a wider meaning
than "set", as discussed in the next-to-last paragraph of section 5.4
(i.e., the next-to-last of the paragraphs preceding Exercise 5.4:1).
But as noted in the paragraph after that one, it is also used in
contexts where the only relevant classes are sets, so in these cases
it means the same as "set".
----------------------------------------------------------------------
In connection with the concept of a ring with involution (mentioned
in the paragraph before display (6.4.1)), you ask about the etymology
of the word "involution".
Never thought of that!
I looked it up in the OED. It seems that "involution" is a noun
from the root of "to involve", and most of the nonmathematical
meanings that it gives have the idea of entanglement. It gives
three mathematical meanings: An old one, which apparently meant
raising a number to a power; a second one, which I was familiar with as
a map of the plane into itself which in appropriate polar coordinates
has the form (r,\theta) |-> (r^{-1},\theta) (though they only
refer to functions of the line into itself), and finally "A function
or transformation that is equal to its inverse."
The OED is not big on explaining how meanings come from each
other. My guess is that "raising to a power" arose from the idea
of a number being "entangled with itself", and is unrelated to
the other two (though the geometric sense could be somehow related
to r^{-1} being a power of r, or, if one reverses the sign of
\theta, to taking inverses in the complex plane). I think that the
(r,\theta) |-> (r^{-1},\theta) sense might have come from a
biological sense that they show, "A rolling, curling, or turning
inwards" on the part of an organ. This fits with the etymology of
"in+volu-" = "in-turning". Something that "turns inwards" often
turned inside out, and if "involution" came to have that meaning,
it would easily fit (r,\theta) |-> (r^{-1},\theta). Then this could
have been generalized to any function of order 2, giving their final
sense; and then specialized within ring theory to an order-two
map with the property I state.
(By the way, the map (r,\theta) |-> (r^{-1},\theta) has an interesting
geometric property: it takes \{lines and circles\} to itself. And
on points with rational values of r, it preserves the property of
having rational distances from each other (by an easy observation
involving similar triangles). So it is a useful tool in studying
families of points with rational distances among them.)
----------------------------------------------------------------------
You ask how the exchange axiom, (6.4.1) is used in showing that bases
of a vector space all have the same number of elements.
More precisely, that axiom is used when we know that one of
the bases is finite. (An entirely different method is used
when both are infinite, which calls on the fact that all
vector-space operations are finitary.)
The idea is as follows: Suppose B_1 and B_2 are bases
of V, with B_1 finite. We do induction on the number of
elements belonging to B_1 but not to B_2. If that number
is 0, we are done, since one basis can't be properly contained
in another. If not, let z\in B_1 not be a member of B_2,
and let X = B_1 - \{z\}. Since X does not span V, its span
(closure) must miss some y\in B_2. Applying the exchange condition
to this X, y and z, one can deduce that (B_1 - \{z\}) \cup \{y\}
is again a basis of V; and it has the same number of elements as
B_1, but more elements in common with B_2 than B_1 did, allowing
us to complete the induction.
Check out the linear algebra text where you first saw the uniqueness
of dimensions proved, and see whether the argument there is
"essentially" the above.
----------------------------------------------------------------------
You ask about the relationship between the concept of
Galois connection in my notes (section 6.5), and the one at
http://en.wikipedia.org/wiki/Galois_connection .
Note that in Exercise 6.5:2, I give an equivalent description of a
Galois connection, and in the second half of that exercise, I generalize
it to partially ordered sets. This corresponds to the "Alternative
Definition" in the Wikipedia article, which they call an "antitone
Galois connection". Their definition of a "monotone Galois connection"
is simply a Galois connection in that sense between a poset A and the
opposite of a poset B.
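Both pictures can be illustrated by the Galois connection induced by a
relation; here is a small sketch of my own (in the antitone sense),
using the divisibility relation on a finite set:

```python
A = {1, 2, 3, 4, 6, 12}
B = set(A)
rel = lambda a, b: b % a == 0   # "a divides b"

def star(S):
    # S* = the elements of B related to everything in S.
    return {b for b in B if all(rel(a, b) for a in S)}

def dagger(T):
    # T+ = the elements of A related to everything in T.
    return {a for a in A if all(rel(a, b) for b in T)}

S = {2, 3}
print(star(S))                    # common multiples in B: {6, 12}
print(dagger(star(S)))            # the closure of S: {1, 2, 3, 6}
print(star(dagger(star(S))) == star(S))   # S* = S***: True
```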
----------------------------------------------------------------------
Regarding Lemma 6.5.1, you write
> The first couple conditions given in this lemma look like theorems
> of intuitionist logic regarding negation:
>
> A -> B <=> ~B -> ~A
> A => ~ ~A
> ~A <=> ~~~A
>
> but without the classic
> ~~A => A
>
> which would be analogous to the lemma condition (ii) being
> equality rather than inclusion. Is there a connection here?
I don't know. I haven't studied intuitionist logic; but it
sounds interesting. Maybe one has a Galois connection on
propositions, given by "is incompatible with under intuitionist
logic" ... ?
----------------------------------------------------------------------
> In example 6.5.6 what is a radical ideal?
In a commutative ring R, the radical of an ideal I is the set of
all elements r\in R such that some power r^n lies in I. This is
itself an ideal. A "radical ideal" is an ideal that is its own
radical; i.e., that contains r if it contains r^n for some
positive integer n.
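As a quick illustration in the ring Z rather than a polynomial ring:
the radical of the ideal 12Z is 6Z, since 6 = 2*3 is the product of
the distinct primes dividing 12. A brute-force check over a small
range (the bound and the limit on exponents are arbitrary choices):

```python
def radical_elements(n, bound=100):
    """Elements r in [0, bound) with some small power r^k in the ideal nZ."""
    return {r for r in range(bound) if any(r ** k % n == 0 for k in range(1, 8))}

# r has a power divisible by 12 = 2^2 * 3 exactly when 6 divides r:
assert radical_elements(12) == {r for r in range(100) if r % 6 == 0}
```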
In the context of 6.5.6, note that if a polynomial f has the
property that f(a_1,...,a_n)^m = 0 for some m\geq 1 and some
(a_1,...,a_n)\in \C^n, then f(a_1,...,a_n) = 0 for the same
(a_1,...,a_n). Using this
observation it is not hard to see that for any subset A of \C^n,
the set of polynomials A* is a radical ideal of the ring of
polynomials.
----------------------------------------------------------------------
You ask about generalizations of the duality on convex sets that
I describe in Example 6.5.7; in particular, in the case of
polyhedra in R^3, mentioned in class.
It looks to me as though it should be possible to generalize the
duality to nonconvex polyhedra X whose faces don't contain the
origin, 0. Namely, write the plane of each face F_i of X in the
form f_i(x) = 1 where f_i is a linear functional, and take the
vertices of the dual to be these points f_i. Let two vertices
f_i, f_j be connected by an edge in the dual if the faces F_i, F_j
meet at an edge in X, and let the dual have a face with vertices
f_{i_1},...,f_{i_k} if the faces F_{i_1},...,F_{i_k} meet in
a vertex in X. This duality wouldn't correspond in any way I can
see to a Galois connection, but it looks fairly easy to work with.
However, there would be complications: a non-convex polyhedron can
have more than one face lying in the same plane, and this would lead
to the dual polyhedron having vertices that have to be "counted more
than once". So one would have to set up a theory of polyhedra with
vertices possibly counted more than once, and if one thinks about it,
the same phenomenon for edges, and probably faces.
One could also approach the construction more abstractly, using
a formal description of a polyhedron in terms of abstract vertices,
edges, and faces, with an incidence relation. I don't know just what
properties the incidence relation should be assumed to have, but
the properties would probably be self-dual, and so allow dualization.
You write that you tried something like that out for a torus,
and it seemed to be self-dual. This is probably because the
Euler characteristic, V - E + F, is unchanged under interchanging
V and F. But this wouldn't work in higher dimensions; first,
because in that case the Euler characteristic doesn't completely
determine the structure of the manifold, and second, because in even
dimensions, dualization would change the sign in Euler's formula.
----------------------------------------------------------------------
Regarding Example 6.5.8 you ask about my assertion that when
X is a ring of abelian-group endomorphisms of M, so that M is
an X-module, then X^* is the ring of X-module endomorphisms of M.
Well, have you written out, on the one hand, the condition for
t\in T to belong to X^*, and on the other hand, the condition
for t\in T to be an X-module homomorphism, and compared them?
If you did, but don't see why the resulting properties should be
equivalent, then send me the conditions you have written down, and
I'll say more.
----------------------------------------------------------------------
You ask about the assertion in the paragraph following Exercise 6.5:8
that the set of propositions implied by s \vee t is the intersection
of the set of propositions implied by s and the set of propositions
implied by t.
Well, let me know how far you were able to get. There are two
parts to such a statement of equality: that any proposition
implied by s \vee t must be in that intersection, and that
any proposition in that intersection must be implied by s \vee t.
Can you prove either one of these statements? (If you have trouble
with one of the directions, you might ask yourself "What might
examples of propositions s, t and another proposition p for
which the desired implication doesn't hold look like?") When
you've gotten as far as you can with this, let me know what you
see and what you don't see, and I'll help.
----------------------------------------------------------------------
Regarding Galois connections (section 6.5) you ask under what
conditions the closed sets of one of the resulting closure
operators will be the closed sets of a topology. Since the
class of closed sets under a closure operator is automatically
closed under arbitrary intersections, the conditions that have
to be satisfied are that the empty set be closed, and that
the union of two closed sets be closed. (The latter is
condition (b) of Exercise 6.3:15.)
There are nice conditions one can assume on a relation
R \subseteq S x T that will imply these properties. To make the
empty set closed, one can assume that there is an element of T
that relates to no elements of S. To make unions closed, one can
assume that for any two elements t_1, t_2 \in T, there is an
element t_1\vee t_2 \in T, such that the elements it relates
to under R are precisely those to which either t_1 or t_2
relates. (Cf. first display after Exercise 6.5:8.) This may seem
unnatural -- it does not hold in "typical" Galois connections --
but neither do "typical" Galois connections have the property that
unions of closed subsets are closed. An example where it does hold,
other than languages with an operator \vee and their models, is
Example 6.5.6 (points of complex n-space, and the polynomials that
are zero at those points). For any two polynomials t_1 and t_2,
their product can be used as "t_1\vee t_2"; and the Zariski topology
on complex n-space arises from this Galois connection.
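The key point, that the zero set of t_1 t_2 is the union of the zero
sets of t_1 and t_2 (valid over any integral domain), can be checked
numerically on a small integer grid; the two polynomials below are
arbitrary examples of my own.

```python
t1 = lambda x, y: x - 1          # zero set: the line x = 1
t2 = lambda x, y: y              # zero set: the line y = 0
join = lambda x, y: t1(x, y) * t2(x, y)   # "t1 v t2" as the product

# Over an integral domain, a product is zero iff some factor is zero:
pts = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
for p in pts:
    assert (join(*p) == 0) == (t1(*p) == 0 or t2(*p) == 0)
```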
The sufficient conditions described above are not necessary.
For instance, to get the empty set to be closed, it suffices
that for _each_ element of S there be an element of T which
does not relate to it; and one can similarly weaken the
condition that leads to finite unions. However, my guess is
that the properties I've described will tend to give the most
natural cases where the closed sets under the Galois connection
form the closed sets of a topology.
----------------------------------------------------------------------
You ask, concerning the first sentence of Example 6.5.9, where I refer
to "objects of this sort", whether this means sets with operations
of a given signature; and also whether there are any restrictions on
the language in which the propositions comprising T are expressed.
My intent was to be very general. Rereading that Example, I think
that in the first line, after "Let S be a set of mathematical
objects", I ought to add, "of a given sort (e.g., of groups, of
positive integers, of topological spaces, ...)". Then, hopefully, the
sense of "of this sort" later in the sentence will be clear. The
propositions can likewise be in any language -- all that is needed
to get a Galois connection is that for each object s\in S and each
proposition t\in T, it makes sense to say whether s satisfies t.
The point of this Example is to show that the basic ideas of studying
sets of mathematical objects determined by propositions that they
satisfy, and sets of propositions determined by the mathematical
objects that satisfy them, can be looked at as an example of the
concept of Galois connection.
Of course, when one wants to study such a situation further, one will
generally want restrictions of one sort or another. For instance,
the paragraph following Exercise 6.3:8 discusses certain properties
that follow if the language in question includes the operators
"or" and "and" (interpreted in the standard way), and T is closed
under those operators. The material of Chapter 9 concerns the
situation you asked about, where S consists of all sets with
operations of a given signature. There, T consists of all identities
in those operations (but does not contain expressions formed with
"or". Whether we allow "and" (written "\wedge") doesn't really
matter, because, e.g., if t_1, t_2, t_3 are identities, then the
set of algebras satisfying all members of, say \{t_1\wedge t_2, t_3\}
is also the set satisfying all members of \{t_1, t_2, t_3\}.)
----------------------------------------------------------------------
You ask about the relation between closed subsets, i.e., sets fixed
under "**", where "*" is the operator in a Galois connection, and
topological closure, such as comes up in Galois theory of infinite
field extensions.
Well, closure operations occur throughout mathematics, as the examples
given in section 6.3 show. It happens that in one area, topology,
one deals with a closure operation that is simply called "closure".
This is perhaps what led you to look at the word "closed" in that way.
I guess when a closure operator is not finitary, i.e., when the
closure of the union of a chain of subsets can contain more elements
than the union of the closures of the sets in the chain, then the
way these new elements arise is often given by a topological closure.
So when this happens, as in the field extension case you mentioned,
topology will be involved. I don't know whether every closure
operator on a set can be decomposed somehow into a finitary closure
operator and a topology ... . But anyway, as Exercise 6.5:3 shows,
every closure operator on a set can be looked at as arising from some
Galois connection, but as Exercise 6.3:17(i) shows, those closure
operators that come from topologies alone are highly restricted.
----------------------------------------------------------------------
You asked about my use of the phrase "(generally infinite)" in the
fourth-from-last paragraph of chapter 6.
The reason I put that phrase in was, of course, that we usually
see the conjunction symbol used to connect two propositions,
or, if several conjunction symbols appear, finitely many; and I
wanted to make clear that the finiteness restriction that is
automatic in such cases was not being assumed there.
I could equally well have expressed this by writing "(possibly
infinite)". In choosing to say "generally", I was implicitly assuming
that in mathematics, infinite structures (such as the ring of integers,
or the real numbers) are more often of interest, and finite structures
a more special case. But for mathematicians who specialize in the
study of finite objects (e.g., finite groups, finite lattices, etc.),
the reverse is true. So there was no absolute justification for my
choice of word.
----------------------------------------------------------------------
You write that you have heard that some mathematicians remove the
existence of identity morphisms from the definition of category
(Definition 7.1.2).
I don't recall hearing that. Can you point me to an example?
You say that this is as reasonable as treating semigroups and
monoids along with groups. I'll agree that it is as reasonable as
considering semigroups -- but not that it is as reasonable as treating
monoids! The definition of "monoid" embodies the natural structure
on the set of endomorphisms of a mathematical object. There are indeed
cases where the definition of a semigroup describes a structure
that one wants to deal with; but these come up only in more complicated
situations: when one wants to look at endomorphisms of a mathematical
object that satisfy some restriction which respects multiplication,
but isn't satisfied by the identity map. E.g., "all maps of
the infinite set X into itself that have finite image", "all
non-one-to-one maps of X into itself", etc.. Generally, these
are not "stand-alone" examples; rather, they occur as subsemigroups
of monoids (in the above two cases, the monoid of all maps of X
into itself).
Why don't I similarly introduce "nonunital categories"? There are
endless tangents one can go off on, and one has to limit what one
covers, both for reasons of time, and to give a unified subject matter
that the student can absorb. If it seems from the notes that I
am inclined to go in all directions, this is illusory -- I give
very varied examples of the major concepts (such as "universal
constructions") so as to provide a full perspective for understanding
them. But in the basic concepts that I am presenting, I try to
stick to the important ones, and not throw in less important variants.
After learning about categories in the standard sense, the student who
has reason to study structures that are essentially subsystems of
categories, closed under composition but not necessarily containing
all the relevant identity morphisms, can easily do so.
----------------------------------------------------------------------
Regarding constructions like G_{cat} for G a group,
and P_{cat} for P a partially ordered set (in the 6 paragraphs
preceding Exercise 7.2:1), you write
> ... the emphasis on relating the structure of certain categories
> to the structure of mathematical objects such as monoids, groups,
> partial orders, etc., is much greater than in previous introductory
> texts which I have read ...
>
> Is this primarily intended as a way of providing many "concrete"
> examples ... or ... will it become a useful mathematical tool ...
I would say that my primary motivation was neither of these: it
was to show that categories are "the same sort of things" as groups,
partially ordered sets, etc.. To which I will add that it is equally
important to see categories as being "of a different sort", in that
they can represent in one entity the vast array of all structures in
a field of mathematics. But the way categories are used makes the
latter viewpoint clear, while the viewpoint of them as mathematical
structures like groups, monoids, etc. often gets overlooked. It is
worth having both complementary understandings.
Secondarily -- yes, these constructions give a nice class of
examples of categories; examples different from the sort that
one usually sees.
And finally, there are some uses for such examples. For instance,
we will see that if G is a group, then an "action" of G on an
object of a category C is equivalent to a functor G_{cat} --> C;
so constructions like "the fixed-point subobject of the action of
G on X" will be expressible as an instance of the "limit" of
a functor.
----------------------------------------------------------------------
You ask about the meaning of "isomorphic" in exercise 7.2:1.
An isomorphism i: C --> D of categories, like an isomorphism between
other mathematical objects, means a way of mapping the elements
comprising C bijectively to the elements comprising D so that the
structure is exactly preserved. In detail, this means a bijection
i_{Ob} of object-sets, and for each pair of objects X, Y \in Ob(C),
a bijection i_{X,Y}: C(X,Y) --> D(i_{Ob}(X), i_{Ob}(Y)), which
respects composition (and identity morphisms, though that follows
from the other conditions).
----------------------------------------------------------------------
You ask about the omission of arrows representing composite morphisms
in diagrams of categories (e.g., in the paragraphs following
Exercise 7.2:1).
In general, diagrams that we draw show morphisms that are going
to be discussed, and that are not merely composites of other
morphisms shown. If we tried to draw all the morphisms in a category,
the result would usually be far too complicated and confusing to
the eye. Our pictures simply show the key things we need to focus
attention on.
We don't _always_ omit all morphisms that are composites of others that
we show. E.g., in the diagrams in Proposition 4.3.3 and its proof,
we showed the diagonal arrows even though they are (after the fact)
composites of the horizontal and vertical arrows. But this is because
conceptually they were given before the vertical arrows, and the
properties characterizing the vertical arrows required the diagonal
arrows to state them.
So we omit composite arrows when we can. My showing the diagonal arrow
in the first display after Exercise 7.2:1 was exceptional -- based on
the very introductory nature of this section.
----------------------------------------------------------------------
You ask about naturally occurring examples of composition of relations
(2nd paragraph after display (7.2.1)) other than the case of functions.
Well, a lot of the things that are loosely called functions but
aren't really can be thought of as relations. In calculus texts
one sees "the function 1/x from real numbers to real numbers",
but it is not a function because it isn't everywhere defined; and
in some contexts one talks about "multivalued functions", such
as "+- sqrt x". The obvious way of "composing" these corresponds
precisely to composition as relations. Phrases like "is a friend
of a client of --" can be thought of as the composite of the relation
"is a friend of" and "is a client of". But mostly, I would say that
if and when one has a question about composition of relations, one
can use the definition itself, and gain experience with the concept
by applying it in trying to answer one's question. It isn't a
major topic of this course, so there will be few such questions here.
(I have a preprint which considers, among a number of other structures,
the monoid of self-relations on a set, under the composition operation,
about which there are some open questions; you can look at it at
http://math.berkeley.edu/~gbergman/papers/embed.pdf .)
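A minimal sketch of composition of relations (represented as sets of
ordered pairs), using the multivalued "+- sqrt" example mentioned
above; the names here are my own.

```python
def compose(R, S):
    """(x, z) lies in the composite iff (x, y) in R and (y, z) in S
    for some y.  (Convention: apply R first, then S.)"""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

xs = range(-2, 3)
square = {(x, x * x) for x in xs}     # the function x |-> x^2, as a relation
sqrt_pm = {(x * x, x) for x in xs}    # the multivalued "+- sqrt x"

# Squaring, then taking "+- sqrt", relates each x to both x and -x:
assert compose(square, sqrt_pm) == {(x, s * x) for x in xs for s in (1, -1)}
```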
----------------------------------------------------------------------
You ask about the term "partial operation" used two paragraphs
above Exercise 7.2:2.
A "partial function X --> Y" means a function from a subset of
X to Y. E.g., in Math 1A, when one speaks of "the function 1/x"
or "the function sqrt x", these are partial functions from the real
line to the real line. A partial binary operation on a set X is a
partial function X x X --> X.
In particular, if X is the set of all germs of analytic functions at
points of the complex plane, then one has a partial operation of
composition, since one can sometimes compose the germ of a function
f at a point z_1 with the germ of a function g at a point z_0,
namely, if and only if g(z_0) = z_1. As stated, these are exactly the
cases needed to make "GermAnal" a category.
----------------------------------------------------------------------
I'll somewhat arbitrarily put this question about "empty composites"
with material related to section 7.3, though it was actually asked much
later.
> ... Is there a nice way to define the composite of a collection
> (or sequence?) of functions such that the composite of the empty
> collection is the identity map? ...
Well, see what you think of this definition. Suppose we are
given n+1 objects X_0,...,X_n in a category C, and for 0\leq i < n,
a morphism f_i: X_i -> X_{i+1}. We want to give the simplest possible
definition of their composite, a morphism X_0 --> X_n. So we will
recursively define \prod_{i=m-1} ^{0} f_i: X_0 --> X_m. The
recursive step will obviously be
\prod_{i=m} ^{0} f_i = f_m (\prod_{i=m-1} ^{0} f_i).
What should we take as the base step? The naive answer would be to
make the base step the definition of the product with m=0 by
(\prod_{i=0} ^{0} f_i) = f_0. But I would say that a more elegant
solution is to define the empty subproduct of this chain of
morphisms, \prod_{i=-1} ^{0} f_i, as id_{X_0}: X_0 -> X_0. That
way, "f_0" gets introduced at the m=0 recursive step just as each other
f_i gets introduced at the m=i step.
How does that look?
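The definition above can be mirrored directly in code: fold the
sequence of functions with the identity map as the starting value, so
that the empty composite comes out as the identity. A sketch (the
names and sample functions are illustrative):

```python
from functools import reduce

def composite(fs):
    """Composite f_{n-1} o ... o f_0 of a (possibly empty) sequence of
    functions; the empty composite is the identity map, mirroring the
    base step prod_{i=-1}^{0} f_i = id."""
    return reduce(lambda g, f: (lambda x: f(g(x))), fs, lambda x: x)

fs = [lambda x: x + 1, lambda x: 2 * x, lambda x: x - 3]
assert composite(fs)(5) == 2 * (5 + 1) - 3   # = 9
assert composite([])(5) == 5                 # empty composite is the identity
```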
----------------------------------------------------------------------
You ask how one can allow a member of Ar(C) to belong to more
than one hom-set (one page into section 7.3), given that they are
drawn as arrows with definite source and target.
The fact that we draw them that way isn't part of the definition of
a category! It is simply a convenient way that we picture morphisms.
So it is our right to draw diagrams that way that one might question,
not whether one can allow morphism-sets to overlap.
To the question "How can we justify drawing diagrams with each
arrow having a source and target, when a given element may lie
in more than one morphism set?", I think the right answer is that
the arrow f we draw from X to Y represents f "regarded as a member
of C(X,Y)"; and if we want to formalize that concept, we can do it
by saying that the arrow really represents the 3-tuple (X,Y,f).
This is not really different in nature from such questions as how
we can justify writing the composite of elements f and g of
a group G as fg, given that the underlying set |G| admits many
group operations, and the product will be different in one than in
another. The answer to that one is that fg is our shorthand
for \mu_G(f,g), and it is safe to use such shorthand when we are
not explicitly dealing with more than one group-structure on the
same set (or on groups with overlapping underlying sets).
> ... Also, since the text will not require that hom-sets be
> disjoint, what advantages will this give? ...
So far as I am concerned, only the advantage of not alienating people
who are used to the definition saying that a function f: X --> Y
is a subset of X x Y. To such people, the ordinary systems
of sets and maps that they are used to would not form categories
if we used the more restrictive definitions. Too many categorists
don't care about that -- they take the attitude "We know the right
way to do things", don't try to make them intelligible to the
general mathematical community, and wonder why category theory
is underappreciated!
But once one is doing category theory, the things one is interested
in from one point of view can always be translated into a variant
language; so a student who has read this text should have no trouble
adjusting to the axiomatics of a text that assumes hom-sets disjoint.
----------------------------------------------------------------------
In connection with the discussion on the last two pages of section 7.3
of attitudes about categories, you ask whether category theory "has
any content", or is, as Wittgenstein said of logic, merely "a
tautology".
Well, it has been said that all of mathematics consists of tautologies.
Insofar as that is true, it is true of category theory in particular.
A tautology is a statement that is automatically true; and it is
usually thought of as therefore being a statement that is obviously
true (such as "X = X"). But a statement can be automatically true
without being obvious; and I think the nontrivial results of
mathematics fall into this category. So being tautologies does
not keep them from being powerful and useful additions to human
knowledge.
----------------------------------------------------------------------
You ask (in connection with the discussion at the end of section 7.3)
whether category theory is essentially a language in which to say
things about existing fields of mathematics, or is a field with
nontrivial content.
I say, ask yourself that question at the end of this course!
----------------------------------------------------------------------
You write that you feel that the Axiom of Universes (section 7.4)
doesn't seem as believable as the other axioms, which you feel
are clearly true.
Well, I don't consider the axioms of set theory to be "true" or "false"
(cf. section 5.7); I would judge them in terms of whether or not they
form a useful model for the way we think about collections of entities,
which enables us to reason precisely about these.
Regarding the Axiom of Universes, the first few paragraphs of
section 7.4 give reasons for setting up a set theory in which
universes exist: that if we start with a set theory that merely
satisfies ZFC, we would like to be able to talk about the
collection of "all sets"; but that won't be a set. If we set
up assumptions that allow us to treat these classes just
like sets, why not rename them sets and assume ZFC applies
to these new things we are calling sets? And if we do this once,
why not allow the process to be iterated indefinitely, and express
this as the Axiom of Universes? We can never get away from the
problem that *all* the things we are considering sets will be a
collection that is not a set; but we'll have a system where the
damage that that fact does has been essentially eliminated.
Finally, as I said in class (I don't know whether you were
already away at that point), most of the Axioms of ZFC consist
of weakenings of the Axiom of Abstraction (described about one
page after the list of axioms of ZFC). The general rule seems
to be that any weakening of that axiom is OK if it doesn't allow
us to define a set S in a way that requires one to already
"have" S available to consider in applying the criterion
for membership in S. And the Axiom of Universes is OK by that
standard: each universe is built up from sets constructed "before"
it. So -- since the Axiom of Abstraction seems intuitively "almost
true" -- it is reasonable to accept this instance of it as "true".
> ... Is it 'reasonable' to say that 'small sets' in the 'usual
> sense' correspond to elements of some such minimal universe?
Well, I'd rather not make such a convention. ZFC doesn't preclude
the existence of universes, so if you make the above convention,
then things that people just assuming ZFC called "sets" would not
necessarily be "small sets in the usual sense" under your convention.
Moreover, set theorists like to study "large cardinal axioms", e.g.,
the existence of a "measurable cardinal", and my understanding is
that almost all (all?) large cardinal axioms imply the existence
of some inaccessible cardinals, equivalently, of some universes
(though not necessarily the Axiom of Universes).
----------------------------------------------------------------------
Regarding the development of the Axiom of Universes in section 7.4,
under which a universe is a set, and every set is a member of a
universe, you ask, "Don't these imply that every collection of sets
is a set which is a contradiction?"
No, nothing in the axioms implies that every collection of sets is
a set. Under the axioms, for each set X there is a universe which
contains X; but that universe will vary from one set X to another.
So there is no assertion that one universe contains all sets.
----------------------------------------------------------------------
Regarding the discussion at the beginning of section 7.4, you ask
> ... you redefine "set" to "small" and "large" set. Then, later, you
> mention the lack of a "set" in ZFC that satisfies being a universe,
> though the class of all sets would be. I am understanding this to be
> in the old sense? The term "set" will always refer to a conventional
> set, or will set be used to encompass large and small set?
Actually, the third paragraph of section 7.4 was just meant to lead
the reader to the ideas developed in what followed; so when I said
"So let us change their names ...", I really meant "So suppose we
changed their names ...". As of the next paragraph, we begin
formally setting up what we really do. What we talked of loosely
as "old sets" and "new sets" are now both sets within the set
theory that we are discussing. We no longer need to consider large
sets "things we used to call classes"; though we can still say that
the members of U form a self-contained set theory, from within which
the things not in U look like "classes that aren't sets".
I hope this helps. Let me know if you still have difficulty with this.
----------------------------------------------------------------------
Regarding the idea of fixing some universe U, and considering
those objects of a given sort (including categories) that lie in U,
as described following Definition 7.4.1, you write:
> ... I'm having trouble understanding why we should expect any
> set of categories to be in this "standard" universe U. ...
If U contains, say, some set X of groups, then it will also
contain the category whose objects are the members of X, and
whose morphisms are the group-homomorphisms among these. And if
U contains a set Y of sets of groups, it will contain the set
of categories whose members are the categories constructed as
described above from the members of Y. Let me know whether you
have any difficulty with proving these statements, and/or if you
have difficulty seeing that any universe U will contain sets
of groups, and sets of sets of groups.
Intuitively, ZFC was set up to handle "everything that mathematicians
ordinarily do", and it does this quite well. Forming objects
constructed as ordered tuples, sets of mathematical objects, sets of
homomorphisms among them, etc., are among these things; so your
understanding of ZFC should include, at least in sketch form, an
understanding of how these things are done. Since categories are
defined as tuples with certain properties, ZFC can handle these
equally well; and since every universe satisfies ZFC internally,
these properties will hold in any universe. The one Achilles' heel
of ZFC is the impossibility of defining "the set of all --", where "--"
is not restricted in terms of some given set. The Axiom of Universes
partly overcomes this: it allows us to speak of "the (large) set of all
(small) --"; and that's what we must do when we define things like "the
category of all groups". But if you merely want to get some, and
indeed, lots of categories within U, that's easy: Just start with
some set of groups etc. within U and as described in the preceding
paragraph, form the corresponding category. There are other sorts of
categories that arise in ways different from "mathematical objects of a
given sort and morphisms among them", as discussed in the paragraphs
before Exercise 7.2:1. These are easy to apply in any universe as
well.
----------------------------------------------------------------------
In connection with Definition 7.4.4, you asked whether the concept
of an object such as a group G being U-small referred to the
whole structure, e.g., the group (|G|, \mu_G, \iota_G, e_G), being
a small set (a member of U), or just the set |G|.
Those two conditions are equivalent, since once |G| is U-small,
the map \mu_G, as a map |G| x |G| --> |G|, and hence a subset
of (|G| x |G|) x |G|, will also be U-small, and likewise for
\iota_G and e_G; hence the 4-tuple (|G|, \mu_G, \iota_G, e_G)
will be U-small.
Anyway, the intended meaning (in case there are situations
where the equivalence is not evident) is that the whole structure
(|G|, \mu_G, \iota_G, e_G) is a U-small set.
----------------------------------------------------------------------
Regarding the notion of a large group, mentioned in Exercise 7.4:1(ii),
you ask "Do examples of this naturally appear? Are there interesting
theorems about these or are they essentially the same as small groups?"
Remember that a group that is "large" with respect to one universe
will be "small" with respect to another universe. Generally, one
will study a given group within a universe to which it belongs, and
there it will always be "small".
Anyway, it's easy to get groups of arbitrarily large cardinalities;
e.g., free groups on big sets; and a group of large cardinality won't
lie, even up to isomorphism, in a universe of smaller cardinality.
Perhaps your question really means "Are there interesting results
that hold for groups lying in some universes that don't hold for
groups lying in others?" Well, set-theorists are interested in
"large-cardinal axioms", and if we consider a universe determined
by a cardinal of the sort that one of those axioms concerns, some
set-theoretic statements will be true that are not true in a
sub-universe not satisfying the same large-cardinal axiom. Probably
there are ways of encoding some of those set-theoretic statements
in terms of the existence of groups with specified properties. But
I'm not familiar with the field, and can't say whether such results
would be of group-theoretic interest.
----------------------------------------------------------------------
Regarding my comment near the end of section 7.4 that if the
Axiom of Universes should not prove adequate for future needs,
one might assume a "Second Axiom of Universes", you ask "Can we
make a Third, Fourth, or nth Axiom of Universes?"
Certainly; but I don't see the point. The Axiom of Universes is
simple, and does what we wanted and more. There's no reason to
assume that the next challenge to the adequacy of set theory, if
there is one, will come from the same direction; so rather than
barricading ourselves against danger from that direction, we should
just be on the lookout for what may come. (If it does come from
that direction, we could just add such axioms.)
> Another way of saying this would be: how long of a chain of
> universes (ordered by "\in or =") can we get? ...
Well, as you note, using the Axiom of Universes, one can get chains
of universes as large as any ordinal (in any of those universes)!
> ... the following reasonable-sounding axiom: "given any collection
> of sets, there is a universe containing all of them," ...
That won't work: "all sets" is a "collection" of sets, but since a
universe is itself a set, we can't have a universe containing them all.
----------------------------------------------------------------------
Regarding the comment just before Definition 7.5.4 that "faithful"
and "full" aren't the only analogs of "one-to-one" and "onto" that can
be considered for functors, you ask what some of the others are.
There are none that come up often enough that I knew their names;
but looking online, I see that a functor F: C --> D is called
"representative" or "dense" if for every object Y of D there
is an object X of C such that F(X) is isomorphic in D to Y;
a kind of "onto-ness" property.
One could consider the "one-one-ness" property of taking
non-isomorphic objects to non-isomorphic objects, but I haven't
found anyplace where that is given a name.
There are lots of adjectives used to describe kinds of functors;
but most of the properties in question are of different sorts
from one-one-ness and onto-ness.
----------------------------------------------------------------------
Regarding Definition 7.5.7 you ask
> ... is it meaningful to speak of a hom-functor for a non-legitimate
> category? In this case, the C(X,Y) are not sets.
Yes they are!! I think you were in class Monday, when I emphasized
that "large sets" are still sets within our version of ZFC -- they
just aren't members of whatever universe U (itself a set) we happen
to be focussing on; but they do belong to some larger universe U'
by the Axiom of Universes.
As noted in the second paragraph after Exercise 7.4:6, our version of
set theory still has the property that "all sets" don't form a set.
But "large sets" that we talk about definitely are sets.
----------------------------------------------------------------------
You comment that antihomomorphisms of groups should have a similar
role in group theory to contravariant functors (Definition 7.6.1)
in category theory, and you ask why antihomomorphisms of groups
are rarely talked about.
Every group has a canonical antiautomorphism, the map
x |-> x^{-1}; so to give an antihomomorphism a: G --> H
is equivalent to giving a homomorphism, f: G --> H defined
by f(x) = a(x)^{-1}. So anything one might want to express
in terms of antihomomorphisms can be expressed in terms of
homomorphisms.
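The identity behind this, (xy)^{-1} = y^{-1}x^{-1}, is easy to
machine-check on a small nonabelian group. Here is a quick Python
sketch (my own illustration, not from the text), with S_3 modeled
as permutation tuples:

```python
from itertools import permutations

# Model S_3 as tuples: p[i] is the image of i under the permutation.
def compose(p, q):            # (p*q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(3)))   # the six elements of S_3

# Inversion is an antiautomorphism: (xy)^{-1} = y^{-1} x^{-1}.
assert all(inverse(compose(x, y)) == compose(inverse(y), inverse(x))
           for x in G for y in G)

# Hence composing any antihomomorphism a with inversion, f(x) = a(x)^{-1},
# gives a homomorphism.  Check with a = inversion itself, so f = identity:
f = lambda x: inverse(inverse(x))
assert all(f(compose(x, y)) == compose(f(x), f(y)) for x in G for y in G)
```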
Monoids and rings don't have such canonical antiautomorphisms,
so one does, occasionally, look at antihomomorphisms among
them. In particular, one often talks about involutions of
rings (see 2nd half of paragraph before (6.4.1)). Aside
from this, though, on those occasions when antihomomorphisms
of rings R --> S come up, one most often writes them as
homomorphisms R^{op} --> S; I guess because, as things
that don't come up often, it is more comfortable to describe
them in terms of things one uses regularly. On the other
hand, contravariant functors are very common in mathematics,
so one refers to them as such.
----------------------------------------------------------------------
You ask whether the Galois correspondence (I guess you mean the
correspondence between subgroups of a Galois group and intermediate
fields) is a functor (section 7.5).
Well, it can certainly be made a functor by regarding the
subgroups of the Galois group as forming a category with
inclusions as morphisms, and similarly for the intermediate
fields: they give anti-isomorphic partially ordered sets P
and Q, which translates to a contravariant functor (section 7.6)
P_cat^op --> Q_cat.
There is a more sophisticated category-theoretic approach to Galois
theory which you might find more interesting. I don't know the
details, but Lenstra used it in teaching Math 250A one year. A lot
of the students found it very difficult, and since I taught 250B the
following semester, I ended up having to re-teach them Galois theory
the traditional way. But Lenstra's notes from his 250A are online
at http://websites.math.leidenuniv.nl/algebra/topics.pdf , and you
might want to look at them. I think the Galois theory itself begins
around the bottom of p.114, though it depends on lots of ring- and
category-theoretic preparation in the preceding sections.
----------------------------------------------------------------------
Regarding the concept of a faithful functor (Definition 7.5.4), you ask
> ... if we have a faithful functor from X to Y and a faithful functor
> from Y to X, (and, say, both are injective on the object-sets)
> are they isomorphic?
Nope. There are counterexamples with 1-object categories X and Y,
where the monoids of endomorphisms of the unique objects are in
both cases abelian groups. (Not finite abelian groups, of course.)
Can you find such an example?
----------------------------------------------------------------------
You note that for the definition of a product category C = \prod C_i
in Definition 7.6.4, the Axiom of Choice guarantees that Ob(C), as
defined, will have at least one element, but you ask whether it need
have any others, and how we can be sure that it is "what we want
it to be".
Well, by its definition, a product set is always "what we want it to
be"; the function of the Axiom of Choice is to guarantee that it will
have the properties we expect it to. That axiom tells us here that if
the categories C_i all have nonempty object-sets, then so will C.
If they each have just one object, then of course Ob(C) =
\prod Ob(C_i) will also just have one. You should be able to
prove from ZFC that if a family of nonempty sets does not consist
wholly of 1-element sets, then their product has more than one
element; as well as such statements as that if the family is infinite,
and each member has more than one element, then the product set has
at least 2^{\aleph_0} elements.
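The first of those ZFC exercises can at least be spot-checked for
finite families (an illustration only; the interesting cases are
infinite). A small Python check, not from the text:

```python
from itertools import product
from math import prod

# A finite family of nonempty sets, not all singletons:
family = [{0}, {0, 1}, {0}, {0, 1, 2}]

P = list(product(*family))
assert len(P) == prod(len(S) for S in family)   # |prod S_i| = prod |S_i|
assert len(P) > 1    # some factor has > 1 element, so the product does too

# An all-singleton family: the product has exactly one element.
assert len(list(product(*[{0}] * 5))) == 1
```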
----------------------------------------------------------------------
> ... you mention that in categories there is no real way to
> distinguish isomorphic objects. Is there any danger, then, of
> always looking at the skeleton category, which identifies
> isomorphic objects? ...
From an abstract point of view, no. But in specific cases, it could
well be confusing. For instance, suppose we are looking at the
category Set, and are considering the ordinals as objects of
that category. Then the countably infinite ordinals (of which we
know there are uncountably many) all become isomorphic. Suppose we
go to a skeleton category of Set in which the only representative
of that isomorphism class is \omega. We could still think about
the chain of inclusions of countable ordinals by taking a retraction
of Set to a skeleton, and looking at the images of the inclusion
maps \alpha --> \beta among the countably infinite ordinals as
certain endomorphisms of the object \omega to which they have all
been retracted; and we could, for instance, describe the first
uncountable ordinal as the direct limit of that chain of endomorphisms
of \omega. But it would be much easier to think of the category
Set in which the countable ordinals are separate objects.
----------------------------------------------------------------------
Regarding Exercise 7.7:2, you ask why this doesn't show that
the category theoretic notion of an isomorphism between objects
deviates from the normal mathematical notion, since in the concretization
T, T(f) is a bijection, but in the others it is not.
I think you're assuming that the "normal mathematical notion" of
an isomorphism is a homomorphism that is bijective on the underlying
set. But that works only for those sorts of objects where such
maps have the property that their set-theoretic inverses are also
morphisms. For a case where that is not so, see Exercise 5.1:1.
You'll see there that the concept of isomorphism agrees with the
category-theoretic version, not with that of being a bijection on
underlying sets.
----------------------------------------------------------------------
Regarding the concept of epimorphism (Definition 7.7.2), you ask,
> ... Is there an accessible paper which gives precise criteria for
> epimorphisms in the category of rings?
Yes. See John Isbell's series of papers,
Epimorphisms and dominions. 1966 Proc. Conf. Categorical Algebra
(La Jolla, Calif., 1965) pp. 232-246, Springer
Epimorphisms and dominions. II. J. Algebra 6 (1967) 7-21
Epimorphisms and dominions. III. Amer. J. Math. 90 (1968) 1025-1030
Epimorphisms and dominions. IV. J. London Math. Soc. (2) 1 (1969) 265-273
Epimorphisms and dominions. V. Algebra Universalis 3 (1973) 318-320
But (as Isbell was always concerned with pointing out), the statement
of the criterion, his "Zigzag Lemma", in the first of the above papers
is wrong; it is corrected in paper IV.
> Also, are the problems in characterizing epimorphisms for rings
> similar to those for monoids?
Yes.
----------------------------------------------------------------------
Concerning the statement following Definition 7.8.1 that the empty
set is the initial object of Set, you ask "What is the morphism
in C(\emptyset, X)?".
We touched on this in reading #1: See the second paragraph of
section 2.4, with the key words, "there is exactly one".
I didn't go into details there, because I felt that the student who
thought this through would see it. You should look at the definition
of a function from a set X to a set Y, and ask yourself what
satisfies that definition when X is the empty set. If you have
trouble thinking this through, ask again.
----------------------------------------------------------------------
Regarding the concept of a free object with respect to a concretization
of a category (Definition 7.8.3), you ask
> Is there a generalization of free object for non-concrete categories?
Well, a free group (etc.) is a group F(X) with a universal
X-tuple of members of its _underlying_set_; so having an "underlying
set" is part of the essence of the concept, and the category-theoretic
abstraction of the underlying set is a concretization. But one can
get various sorts of generalizations depending on how far afield one
is willing to go, and still consider the result a version of the
"free object" concept.
For a small generalization, one can drop the faithfulness condition
in the definition of concretization. E.g., the functor taking every
group to the set of its elements of exponent 2 is not faithful,
but there is an analog of the free group for that functor, namely
the functor taking every X to the group presented by an X-tuple
of generators, together with relations saying that all those generators
(but not necessarily all other elements) satisfy x^2 = e.
Much more loosely, one could call the result of any universal
construction (or at least, any left-universal construction) a
"free" object for the relevant conditions. And some authors do.
In between these, one can consider the construction of the left
adjoint of a functor, which we will see defined in section 8.3 to be
(when it exists) a generalization of a free object construction.
----------------------------------------------------------------------
You ask about (co)products that are preserved under all functors
(possibly assuming that the codomain category has coproducts).
Well, this will certainly be true of the coproduct of any 1-object
family! My guess is that it will not be true in any other cases;
but I don't know.
----------------------------------------------------------------------
Regarding the concept of kernel in the 2nd paragraph before
Definition 7.8.7, you ask
> How closely does the categorical definition of kernel match our usual
> meaning? I'm not aware of any categories with zero objects in which
> the categorical definition differs from the standard definition, ...
My first reaction was that one would have to come up with a pretty
exotic category, and there would not be likely to be a "standard
definition" of kernel there! However, your suggestion
> ... but perhaps one could be concocted by taking some subcategory
> of Ab in which not every standard kernel is an object ...
works: if C is the category of divisible groups (Exercise 7.7:5),
then the map Q --> Q/Z has zero kernel in C under our
definition, but one would ordinarily say that it has kernel Z,
which is not in C.
----------------------------------------------------------------------
In connection with the concepts of pushouts and pullbacks
(Definition 7.8.7) you mention having seen pullbacks in algebraic
geometry, and ask whether pushouts of schemes are equally important.
Well, if you look at affine schemes, pushouts correspond to pullbacks
of rings. In particular, given two subrings of a ring, the pullback
of the diagram formed from that ring and those two subrings is the
intersection of those subrings. But intersection does not respect
the properties that algebraic geometers like, such as being Noetherian.
(Can you find an example of a finitely generated Noetherian ring and
two finitely generated subrings whose intersection is not Noetherian?)
Intuitively, forming a pushout of schemes corresponds to gluing two
schemes together in a manner prescribed by maps from a third; and
this gluing process can create singularities.
----------------------------------------------------------------------
You ask what I mean in the first sentence of Lemma 7.9.4 by unordered
pair of objects.
"Ordered pair" is the standard term for the sort of entity that we
write (x,y). If I said that there was no more than one morphism
between an ordered pair (X,Y) of objects, this might be taken to
mean that C(X,Y) had at most one element; but I want to say more:
that C(X,Y) and C(Y,X) each have at most one element, and that they
can't both have an element unless X=Y. So I use the phrase "unordered
pair of objects" to mean "two objects, with no difference in the
roles we assign them." I used the same phrase in Exercise 7.2:1,
where I made a precise statement, then used this phrase as an
informal translation.
----------------------------------------------------------------------
> What is the point of not assuming disjoint Hom-sets in the definition
> of a category?
As I say in section 7.3, I don't make that assumption "largely because
it would increase the gap between our category theory and ordinary
mathematical usage"; since under conventional definitions, where a
map f: X --> Y is a subset of X\times Y, f doesn't uniquely
determine Y.
Actually, in a many-sorted algebra A with family of underlying
sets (|A|_i)_{i\in I}, one generally doesn't assume the |A|_i
are disjoint; otherwise every time one constructed such an algebra,
one would have to check whether one's construction accidentally
produced elements belonging to more than one |A|_i. So in studying
such algebras, one doesn't want to put in such a requirement.
(E.g., in studying actions of groups on sets, one might want to
consider pairs G, S where G is a group and S a G-set; these
would have two underlying sets |G| and |S|. But sometimes one
wants to consider the action of a group on itself.) And a category
can be looked at as a many-sorted algebra.
So one might say that the choice is between optimizing things for the
person who, given that C is a category, wants to say things about
it, and for the person who, given some mathematical situation, wants
to say "such-and-such is a category". For the former, it would be
best that hom-sets be disjoint; for the latter, that this not be
required.
----------------------------------------------------------------------
You ask how one can prove the statement in the paragraph following
Exercise 7.9:9 that "there is no natural way to make a contravariant
functor out of P_f."
As I use it, the word "natural" is an informal term, like
"obvious" or "reasonable", so it doesn't require a proof. The
sentence simply means that none of the ways that we discovered,
when we discussed the power-set construction, to turn maps among
sets into maps among their power sets, gives a contravariant functor
that takes finite subsets to finite subsets.
However, you might try investigating whether there is or is not
any way to make the construction associating to each set the set
of its finite subsets into a contravariant functor, and if you
can answer the question either way, hand it in as a homework problem.
----------------------------------------------------------------------
Regarding the definition of equivalence of categories (Definition 7.9.5)
you ask:
> ... If F is covariant must G be too? And if F is contravariant
> G also?
Right. More precisely, by the last sentence of Definition 7.6.1,
"functor" means "covariant functor" if the contrary is not stated.
So the definition of equivalence should be interpreted with both F
and G being covariant functors. Then a "contravariant equivalence"
between categories C and D is an equivalence (in that sense)
between C^op and D.
> Is there an example in which FG=Id_D but GF is only isomorphic
> but not equal to Id_C?
Yes. Let C be a category, and D any skeleton on C. Let F
be the functor determined as follows: For each object X of C
let F(X) be the unique object of D isomorphic to X in C, and
choose an isomorphism f_X: X -> F(X), using the identity
isomorphism whenever X \in Ob(D). For h: X -> Y, define
F(h) = f_Y h (f_X)^{-1}. (Draw the diagram to see how this works.)
Let G be the inclusion functor of D into C. Then you'll see
that FG=Id_D but GF is only isomorphic to Id_C.
----------------------------------------------------------------------
> Is the last condition in Lemma 7.9.6 what some people call
> essentially surjective?
Yes. I hadn't encountered the term, but doing a Google Book Search,
I see that it is used that way.
----------------------------------------------------------------------
Concerning the statement after Definition 7.9.7, that the Axiom
of Choice allows us to construct a skeleton for every category, you
ask how we can do this for categories that are not small.
It sounds as though you are still thinking in terms of the paragraphs
of motivation at the beginning of section 7.4, which suggested that
"small" and "large" sets might be used as new names for what had been
called sets and classes. But what we moved on to in that section was
a set theory in which "small sets" were sets in a given universe, and
"large sets" were any sets within our set theory; and in which ZFC,
and so in particular, the Axiom of Choice, applied to the set theory as
a whole, not just to small sets. (We still have "proper classes",
subclasses of the class of all sets, and we can't apply the Axiom of
Choice or our other axioms to these. But a "large category" by
definition has a _set_ of objects; it belongs to our set theory;
it is merely not required to belong to the distinguished universe
U within that set theory.)
----------------------------------------------------------------------
You write that the most intuitive definition of equivalence of two
categories is that they have isomorphic skeletons (Lemma 7.9.8),
and ask why I didn't present this one first.
Well, the definition I gave tends to be the one that comes up more in
the motivating situations -- we have a way of constructing an object of
D from any object of C, and vice versa; and while the composites of
those constructions are not quite the identity functors of C and D,
there are obvious isomorphisms of them with those identity functors.
On the other hand, the skeleton of a category, while formally
convenient, can be intuitively rather far from the original category
-- e.g., in a skeleton of Group, we can't look at the various
infinite cyclic subgroups of an infinite cyclic group Z as distinct
groups -- they're "the same" group Z, mapped into itself by different
morphisms.
Anyway, as I've said before, it's good to have different ways of
understanding the same mathematical concept. Which one gets
introduced first is often a lesser matter.
----------------------------------------------------------------------
In connection with section 7.11, you write:
> Another example of an enriched category: the hom-sets in the category
> of sets with relations as morphisms have a boolean algebra structure.
That's an interesting observation. But to make it an enriched
category structure, one would have to figure out how composition
of morphisms behaves on the given pair of Boolean algebras.
It looks to me as though it will respect joins, but not meets
or complements; so maybe it has to be weakened to an upper
semilattice structure.
----------------------------------------------------------------------
Regarding the final paragraphs of section 7.11, you write:
> ... you allude to a nontrivial morphism between morphisms between
> morphisms in the category CatCat. ...
Not to a nontrivial morphism, but to a nontrivial *concept* of
morphism; i.e., a way of defining "morphism between morphisms between
morphisms" that doesn't just reduce to something one can define in any
category or Cat-category.
Recall that in defining the concept of "morphism of functors", we
used the fact that functors have objects as their outputs, and that
there can be morphisms from the objects produced by one functor to
the objects produced by another. Now if we have two morphisms P and
Q between a pair of functors F and G between Cat-categories C
and D, the morphisms that comprise P and Q may in turn have
morphisms between them (because D is a Cat-category), and if some
choice of these morphisms gives the proper commuting diagrams, then we
consider the resulting family of morphisms to be a morphism m from
P to Q, i.e., a morphism m of morphisms P, Q of morphisms F, G
between objects C and D of CatCat.
----------------------------------------------------------------------
I'll count this question as belonging to section 7.11:
> ... Is there a way to regard a ring as a one-object category, much
> as there is a way to consider monoids and groups?
Yes. Rings are 1-object Ab-categories. (Recall that an Ab-category
is a category whose morphism-sets all have structures of abelian
groups, such that composition is bilinear.)
----------------------------------------------------------------------
> As is mentioned in Section 8.1, all universal objects are initial
> objects in some category. But to show this, don't we need some
> abstract way of constructing a category in which a universal object
> is initial? The examples given in 8.1 seem rather ad hoc, as they
> depend on the specific universal object being considered.
Well, we don't yet have a general definition of "universal object"!
In Chapter 4 we displayed a large bunch of interesting constructions
which seemed to have a feature of "universality" in common; and
in section 7.8 we found that some families of these could be gathered
under common category-theoretic descriptions. In the first half of
Chapter 8 we will look at these more systematically. From the point of
view of section 8.1, "universal object" might be defined as "structure
corresponding to an initial object in some category"; in later sections,
we look at specific sorts of properties that can be described in
this way, but that have more natural descriptions in terms of other
categories, and how to pass between the two sorts of description.
----------------------------------------------------------------------
You ask about my assertion in the paragraph preceding Exercise 8.2:2
that the functor U^\omega is represented by the free commutative ring
on an \omega-tuple of generators.
For any ring R, U^\omega(R) is the set of \omega-tuples of
elements of R (see second sentence of Definition 7.8.5),
and the ring with a universal \omega-tuple of elements is the
free ring on an \omega-tuple of generators. (See discussion
in the second paragraph of section 8.1 of the free group on 3 generators
as the initial object in the category of groups with specified
3-tuples of elements. As discussed at the beginning of section 8.2,
this can be translated as saying that that group is a representing
object for the functor associating to each group G the set of
3-tuples of elements of G.)
----------------------------------------------------------------------
Regarding the statement you read somewhere that Cayley's Theorem is
a case of the Yoneda Lemma (Theorem 8.2.4), you write
> ... Since the one object R of G_cat is just an abstract construction
> I don't really understand what h_R means in this context.
You can call that object "an abstract construction", but the definition
of h_R still applies to it -- go to that definition, take R to be
that one "abstract" object of G_cat, and see what object of what
category h_R takes R to. Then remember that the concept of functor
involves both objects and morphisms. So now check what Yoneda's
Lemma says about morphisms in this case.
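Concretely, h_R applied to the one object of G_cat gives the
underlying set of the group, and h_R applied to a morphism g gives
the map "compose with g on the left", i.e., left translation by g;
that is Cayley's embedding. A minimal Python sketch (my own
illustration, not from the text), using the cyclic group Z/4:

```python
# Model the cyclic group Z/4; Cayley: g |-> (left translation by g).
n = 4
G = range(n)
op = lambda g, h: (g + h) % n

# h_R sends the morphism g to the permutation "left-translate by g"
# of the underlying set of the group:
cayley = {g: tuple(op(g, h) for h in G) for g in G}

# The images are distinct permutations (the embedding is injective) ...
assert len(set(cayley.values())) == n
# ... and composing translations matches the group operation:
for g1 in G:
    for g2 in G:
        composed = tuple(cayley[g1][cayley[g2][h]] for h in G)
        assert composed == cayley[op(g1, g2)]
```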
----------------------------------------------------------------------
You ask about the reason I reserve the word "free" for the
construction of the left adjoint of an underlying-set functor
(e.g., paragraph before Exercise 8.3:3), while other books you have
read use it for more general left universal constructions.
I can't be sure about the books you are referring to without
knowing which they are and looking them over, but I suspect
that they do not develop or assume known to their readers the
general concepts of universal constructions, representable
functors, and adjoint functors. I suspect that if they did,
they too would use those terms in many places where they now
use "free".
Of course, since the word "free" is short and suggestive,
there still might be a temptation to use it in place of
the more technical-sounding terms. But despite this, I
think that when one has a more precise language available,
one will use it.
It's no fault of those authors -- they are writing to an
audience not familiar with the concepts of this course.
----------------------------------------------------------------------
You ask whether there is a connection between units and counits
of adjunctions (Definition 8.3.9), and units and counits of algebras
and coalgebras.
Well, given an adjunction, if we write UF = T, then the unit
gives us a morphism I --> T, and from the counit we can get a
morphism T^2 --> T. In general, a functor T from a category
into itself given with morphisms I --> T, T^2 --> T satisfying
certain conditions is called a "monad" by some, a "triple" by
others; and it is thought of as analogous to a monoid M, looked
at as a set |M| with maps |M|^0 --> |M|, |M|^2 --> |M|
satisfying similar identities. Since the map |M|^0 --> |M| is
called the unit of M, the corresponding operation of a monad
is called the unit of the monad. A dual sort of structure, called
a "comonad", has a "counit". Adjunctions, being very symmetric,
have both.
The above connects the unit of an adjunction with the unit of a
monoid. If one abstracts the definition of a monoid in terms of
the category Set, then applying the same concept to the enriched
category Ab, one gets the concept of a ring; or, using instead
of Ab the category of k-modules, the concept of k-algebra, which
you asked about. Dualizing, one gets the concept of coalgebra.
I'm not sure whether one can actually bring the concepts of
monoids and monads under one hat; I suspect so, namely that if
M is a monoid, then the functor |M| \times -- : Set --> Set
becomes a monad. Likewise for rings, replacing \times with
\otimes: Ab --> Ab.
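As a spot-check of that guess, here is a Python sketch (my own
illustration, not from the text) of T(X) = |M| x X for one monoid,
strings under concatenation, verifying the unit and associativity
laws of a monad on a few sample elements (a finite check, not a proof):

```python
# Monoid: strings under concatenation, with identity "".
e = ""
mult = lambda m, n: m + n

# T(X) = |M| x X, with monad unit and multiplication:
unit = lambda x: (e, x)                           # X --> T(X)
mu = lambda t: (mult(t[0], t[1][0]), t[1][1])     # (m, (n, x)) -> (mn, x)

# Left and right unit laws: both composites with mu give the identity.
assert mu(unit(("ab", 42))) == ("ab", 42)         # mu . unit_{T(X)}
assert mu(("ab", unit(42))) == ("ab", 42)         # mu . T(unit)

# Associativity on a sample element of T(T(T(X))):
s = ("a", ("b", ("c", 0)))
assert mu(mu(s)) == mu((s[0], mu(s[1])))          # mu . mu_T = mu . T(mu)
```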
----------------------------------------------------------------------
Concerning the examples of adjoints given after Exercise 8.3:6,
you write
> Many of our universal constructions create some sort of "free object"
> (free group on a set, free ring on a monoid, "free ring" (tensor ring)
> on an abelian group, etc.). ...
Well, people use the word "free" with various degrees of generality.
The formal definition we have given corresponds to a left adjoint
of a set-valued functor, and does not cover things like the monoid
ring on a monoid, or the tensor ring on an abelian group. On the
other hand, using the term still more widely, one often calls a group
etc. presented by generating set X and relation-set R "the group
freely generated by elements x of X subject to the relations R",
though this does not correspond to an adjoint functor, since only one
object, and not a family, is being constructed. Likewise, as noted in
the reading, the tensor product construction, though it gives an
abelian group "freely generated" by the image of a bilinear map on
given groups, is not a left adjoint.
So I would say that what you are referring to as "free" objects could
be described by the term "left universal constructions", as in the
text; and that a large class of these, but not all, are covered
by the concept of left adjoint constructions.
----------------------------------------------------------------------
You ask how, two paragraphs before Exercise 8.3:7, I translate the
question of whether the product functor C x C --> C has a left
adjoint, into the universal-object question on the next page.
I am essentially using the characterization of adjoint pairs of
functors in Theorem 8.3.8(ii), in which one starts with the right
adjoint U, and characterizes the left adjoint F object-by-object
in terms of it. So if the product functor is to be a right adjoint,
it will have the role of U, and the "C" and "D" of the theorem
will be our C and C x C respectively; and the desired condition
is that for every object of C (which I call X in the discussion
you ask about) there should exist a pair (R_X, u_X) representing
the functor C(X, U(-)). Writing R_X as (Y,Z), this means a pair
of objects Y and Z with a universal morphism of X into Y x Z.
----------------------------------------------------------------------
Regarding Definition 8.3.9 you asked,
> If we have three functors F,G,H, where G is a right adjoint of F
> and a left adjoint of H, is there anything interesting we can say
> about F and H? For instance, in the example you give with products,
> coproducts, and the diagonal functor, the product and coproduct
> functors are essentially dual to each other.
That the product and coproduct are "essentially dual" just means
that they are obtained by dual constructions -- left adjoint and
right adjoint -- from the diagonal functor, which has no left/right
asymmetry. It refers to the way we look at them, not to properties
that they will have in a particular category.
But the question of whether, when a functor has a left and a right
adjoint, those two adjoints have properties relating them, is
interesting. I don't know the answer, but I would guess that some
properties of G, concerning what distinctions it preserves and what
distinctions it loses, would force some dual properties on F and H,
which they would share. But I don't know anything concrete about
this.
Alexey Zoubov has pointed out to me a result along these lines:
F is full and faithful if and only if H is full and faithful.
This is Lemma 1.3 in "Exponentiable morphisms, partial products
and pullback complements" by Roy Dyckhoff and Walter Tholen,
Journal of Pure and Applied Algebra 49 (1987) 103-116,
https://doi.org/10.1016/0022-4049(87)90124-1
----------------------------------------------------------------------
Regarding the p-adic numbers (section 8.4) you write
> I am somewhat familiar with the Hasse local-global principle, or at
> least the result, that an equation solvable over all p-adics and over
> the reals has a solution over the rational numbers. Does this result
> arise in any way from the sort of constructions we have encountered?
> I'm curious if the p-adic rings, themselves inverse limits, have any
> similar relations which are useful in proving such results.
Well, there's an obvious approach to looking for necessary and
sufficient conditions for something to hold: One puts together
all the necessary conditions one can come up with, and hopes that
when one has listed enough of them, their conjunction will be sufficient
as well, and that one can express that conjunction in some concise
form. Necessary conditions for an equation to have a solution
in the integers are that it not contradict anything one can deduce
either using congruences, or using inequalities. Consistency with
what one can deduce using congruences comes down to solvability in
each Z/nZ. But Z/nZ is isomorphic to the direct product of
the Z/p^i Z for p^i ranging over the maximal prime-powers
dividing n; so having solutions modulo all integers is equivalent
to having solutions in all rings Z/p^i Z. For fixed p, the
conditions of having solutions in Z/p^i Z for various i are
not independent, due to the homomorphisms (8.4.3); but the conjunction
of these conditions over all i is equivalent to the existence of
a solution in the inverse limit of (8.4.3), i.e., the p-adics. So
the conditions one can get using congruences can be concisely
summarized by saying "there exist solutions in the p-adics for
each p". On the other hand, the study of the equation via
inequalities comes down to asking whether it has a solution in
the reals. The principle you describe evidently says, "Yes,
congruences and inequalities are together enough to tell whether
an equation has a solution."
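The reduction from "solvable modulo every n" to "solvable modulo every maximal prime power dividing n" is just the Chinese Remainder Theorem, and can be checked by brute force on small cases. In the sketch below, the equation x^2 = a and the bounds are my illustrative choices, not anything specific from the discussion above:

```python
# Sketch: solvability of a polynomial congruence mod n is equivalent,
# by the Chinese Remainder Theorem, to solvability modulo each maximal
# prime power dividing n.

def solvable(a, n):
    """Brute force: does x^2 = a (mod n) have a solution?"""
    return any((x * x - a) % n == 0 for x in range(n))

def prime_power_factors(n):
    """The maximal prime powers p^i dividing n."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            factors.append(q)
        p += 1
    if n > 1:
        factors.append(n)
    return factors

for a in (2, 3, 5):
    for n in range(2, 200):
        assert solvable(a, n) == all(solvable(a, q)
                                     for q in prime_power_factors(n))
```

Passing from "solvable in every Z/p^i Z" to "solvable in the p-adics" is the inverse-limit step described above; that step, of course, cannot be checked by finite brute force.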
Our construction of the p-adics can be thought of as a way of
condensing the information about integers arising from the study
of congruences. As we noted, the p-adics have no zero-divisors,
though each Z/p^i Z does; so this way of condensing information
brings a kind of elegance.
(You state the principle for solutions over the rationals rather
than the integers. I guess for that case one would look at
"congruences of rational numbers modulo fractional ideals", and use
the p-adic field in place of the p-adic integers. By clearing
denominators, I think questions of solutions in the rationals can be
reduced to questions of solutions in the integers, and this likewise
reduces the use of the p-adic field to the use of the p-adic ring.)
----------------------------------------------------------------------
Regarding germs of functions, mentioned in the first paragraph of
section 8.5 as a motivating example for direct limits, you
ask why "germs" are so called.
My guess is that the term arose in complex analysis. There, if
one knows an analytic function in a neighborhood of a point, no
matter how small, that determines uniquely its extension to as
large a connected domain as it can be defined on. The original
meaning of "germ" is "sprout" (what one gets when a seed germinates);
so I think the idea was that a "germ" of an analytic function was
a tiny thing that had enough information to determine the whole
thing.
When one considers general continuous functions instead of analytic
functions, the analogous entities no longer have the property of
determining the value away from the point in question, but the word
was probably carried over because the concept was useful. (And
because the everyday sense of "germ", which refers to a microscopic
entity, made it seem natural.)
----------------------------------------------------------------------
I meant to work the answer to your question into my lecture,
but didn't get to it. You asked whether there is a notion
of convergence associated with the direct limit.
I think that the main connection is conceptual -- that if X
is the direct limit of a family of objects X_i, then the
successive X_i are "more and more like X". E.g., in the
case of a direct limit of groups, they in general have more
and more of the elements that will show up in X, and these
satisfy more and more of the relations that are satisfied there.
But it may be possible to turn this into topological convergence.
If we define a language in which there are symbols for all the
elements of X, and for the kinds of relations that these can
satisfy, then we might define a topology on any set of objects
of the indicated sorts, some elements of which are labeled with
some of the element-symbols, taking for a subbasis of open sets
the sets of objects characterized by having (or not having) an
element symbolized by each symbol, and satisfying (or not satisfying)
a given relation on such elements; and then I think the direct
limit X would be the topological limit of the system of "points"
X_i.
(In an abstract category, we might do something similar using
the existence of maps from various objects in the category
satisfying various composition relations; but I haven't thought
this through.)
----------------------------------------------------------------------
You ask whether, if a functor F and a subfunctor G of F both
have a limit (Definition 8.6.1), the limit of G will be a subobject
of the limit of F.
This works for Set-valued functors, with "subobject" understood to
mean "subset"; hence it works for categories of algebraic objects
where limits can be constructed using limits of underlying sets.
However, using different choices of what to call subobject, it
can fail. E.g., if in Group we choose to define a "subobject" to
be a subgroup of finite index, then it fails for infinite direct
products.
----------------------------------------------------------------------
Perhaps in connection with Mac Lane's use of "complete", mentioned
in the sentence before Exercise 8.6:2, you ask
> For any category, can we construct a "completion" category
> with all limits?
There are such constructions, but I'm not familiar with the
details. One obvious approach is to start with the Yoneda
embedding of C in Set^{C^{op}}, note that, like any
category of the form Set^D, the latter has limits, and
close the image of C in that category under such limits.
Another is to take a category whose objects are formal limits
(one for each diagram whose limit one wants to allow), and let
it have just those morphisms that the universal properties of
limits require. Whether these constructions would give the
same result, I don't know.
A problem is that a category may already have some limits,
but our construction might create new limits that don't
agree with these. For example, the category Set clearly
has inverse limits; but suppose we embed Set in the
category HausTop of Hausdorff topological spaces by giving
each set the discrete topology. Now consider the system
of sets which, for convenience, I will write as
... -> Z/8Z -> Z/4Z -> Z/2Z -> 0. (We're not interested in
the group or ring structure; just in the fact that as one
goes back a step, each point bifurcates.) In HausTop, its
inverse limit is the Cantor set, a compact space. In Set,
its inverse limit is the same space with the discrete
topology. If we construct a universal completion of Set,
then since HausTop is complete, the inclusion of Set in
HausTop will induce a functor from our universal completion to
HausTop which maps the inverse limit of the above system to the
compact Cantor set, while since the set-theoretic Cantor set was
already in Set, this will be mapped to the discrete Cantor set.
Hence the limit of the above system in the universal completion
will not be the same as its limit in the original category.
Of course, one might choose to construct a completion designed
to preserve those inverse limits that already existed, and which
would be "universal" only for functors that preserve such limits.
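The finite stages of the system written above can be enumerated directly. In the following sketch (the function name and the choice of base 2 are mine), an element of the inverse limit "truncated at level k" is a compatible tuple, and the doubling count at each level is the bifurcation mentioned above:

```python
# Sketch: finite stages of the system ... -> Z/8Z -> Z/4Z -> Z/2Z,
# with connecting maps "reduce mod 2^j".
from itertools import product

def threads(k):
    """Compatible tuples (a_1, ..., a_k), a_j in Z/2^j Z, with each
    a_{j+1} reducing to a_j under the connecting map."""
    return [t for t in product(*(range(2 ** j) for j in range(1, k + 1)))
            if all(t[i] % (2 ** i) == t[i - 1] for i in range(1, k))]

# Each point bifurcates as one goes back a step:
assert [len(threads(k)) for k in range(1, 5)] == [2, 4, 8, 16]
```

The full infinite threads form the underlying set of the 2-adic integers; in HausTop the same bifurcating structure carries the topology of the Cantor set.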
Looking online, I see that Grothendieck defined a completion
"pro-C" for every category C. (For the "pro-", see paragraph
before Ex. 8.5:18.) See also http://ncatlab.org/nlab/show/completion
and http://mathoverflow.net/questions/59291/completion-of-a-category
At some point I'll have to learn more about these things, for a
paper I've been putting off writing for years. But not during a
semester when I'm teaching.
(Incidentally, where you wrote "limits", I'm not sure whether you
meant inverse and/or direct limits in the sense of section 8.5,
or the more general limits and/or colimits in section 8.6; but
similar considerations should apply to both, though the details
of the category one got would differ.)
----------------------------------------------------------------------
Regarding Proposition 8.6.3, you note
> ... You write C^D(\Delta(-), F) : C^{op} --> Set, but why need the
> target be sets? Is it assumed that C^D is legitimate?
The C^D(\Delta(X), F) will be sets, all right, but I agree that
they may not be small sets, unless we assume D small. Thanks;
I'll put a note about this on the errata page.
----------------------------------------------------------------------
Regarding Exercise 8.6:6(ii) you ask how an initial object,
defined in terms of morphisms to other objects, can be characterized
as a limit, whose universal property asserts the existence of morphisms
from other objects.
If you look at the definitions of limit and colimit, you will see
that each of these involves morphisms both into and out of the
object in question. (Those morphisms have different roles in the
definition; so the exercise requires relating morphisms that have one
role in one definition to morphisms having a different role in the
other.)
----------------------------------------------------------------------
Meant to answer your question in class, but I fell behind schedule
in covering the earlier material.
You asked, in the context of Theorem 8.8.9, whether one
of the composite colimits in (8.8.10) could exist, and the other
not exist.
Yup. The example I've come up with is a direct limit, indexed
by the natural numbers, of coequalizers in the category of finite
sets. Let the n-th coequalizer diagram have for domain the integer
n = \{0,...,n-1\} and for codomain n+1 = \{0,...,n\}, and let
the two maps from the domain to the codomain be i |-> i and
i |-> i+1. You should check that for each such diagram, the
coequalizer is a 1-element set. On the other hand, let the n-th
such diagram be mapped into the n+1-st by inclusion of domains and
inclusion of codomains. Then if we take the direct limit over
n of either domains or codomains, the result "wants to be" the
set \omega, but this doesn't lie in FinSet; and it is not hard
to show that no object of FinSet has the universal property of
the desired direct limit; i.e., it does not exist; so of course
there's no way to construct the coequalizer of these (nonexistent)
direct limits. On the other hand, if one first takes coequalizers,
getting 1-element sets, one sees that a direct limit of these exists,
namely the 1-element set; so in that order, the colimit of colimits
exists.
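The claim that each coequalizer in this example is a 1-element set can be checked mechanically. The sketch below computes coequalizers in FinSet by union-find (an implementation choice of mine, not anything from the text): the coequalizer of f, g is the quotient of the codomain by the equivalence relation generated by f(i) ~ g(i).

```python
def coequalizer(f, g, codomain_size):
    """Number of classes in the quotient of {0,...,codomain_size-1}
    by the equivalence relation generated by f(i) ~ g(i)."""
    parent = list(range(codomain_size))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x
    for a, b in zip(f, g):
        parent[find(a)] = find(b)
    return len({find(x) for x in range(codomain_size)})

# n-th diagram of the example: i |-> i and i |-> i+1 from n to n+1.
for n in range(1, 8):
    f = list(range(n))             # i |-> i
    g = [i + 1 for i in range(n)]  # i |-> i+1
    assert coequalizer(f, g, n + 1) == 1   # each coequalizer is a point
```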
----------------------------------------------------------------------
You ask about the second sentence after Theorem 8.8.9, saying
that if we assume that all functors from D to C have colimits,
then the isomorphism between the left and right expressions
in (8.8.10) becomes a case of Theorem 8.8.3.
I see that that does need a bit of explaining!
The first assertion of Theorem 8.8.3 says that if F: C --> D
is a left adjoint functor, and S: E --> C has a colimit, then
lim_E FS = F(lim_E S). (I will use "lim" in this e-mail for
the colimit symbol, "lim" with an arrow "-->" under it.)
In the application we want to make, the roles of C and D in
that theorem are taken by C^D and C respectively, the role
of F by lim_D, and the role of S by B(-,-), regarded as
a functor E --> C^D. ("E" is the one symbol that translates to
itself!) Then the formula lim_E FS = F(lim_E S) takes the form
lim_E lim_D B(-,-) = lim_D lim_E B(-,-).
(When I have time, after the end of the semester, I should add
a clarification of this to the online revised version of the
text.)
----------------------------------------------------------------------
You ask why I didn't simplify Definition 8.8.1 by putting
Definition 8.8.13 before it.
The main reason was that I felt the details of Definition 8.8.1
emphasized the concept; and I feel that it is often good to have
to deal with a concept "by hand" before introducing the machinery
that handles it slickly -- one appreciates the machinery, and has
a better sense of what it does.
There's also a technical reason. The morphism of Definition 8.8.13
only makes sense if we assume both (co)limits exist. But if we merely
assume Lim S exists, then the cone of Definition 8.8.1 exists.
In that way, 8.8.1 is simpler than 8.8.13: it merely assumes
Lim S exists, and the condition it gives is that F(Lim S) be the
desired limit of FS.
----------------------------------------------------------------------
In connection with the observation in the first paragraph of
section 8.9 that direct limits commuting with products is what allows
us to make a direct limit of algebras an algebra, you point out that a
family of operations on a set can be regarded as a map from a coproduct
of products of the set into the set, and you ask whether the fact
that direct limits respect coproducts is useful in this connection.
Interesting question. I don't see it as directly important -- it
seems easiest to just regard each of the family of operations as
carrying over to a direct limit. But if we consider a construction
that does not respect coproducts, such as that of direct product, we
get some useful insight.
Consider, for utmost simplicity, algebras consisting merely of a
set with two unary operations, \alpha and \beta. If X and
Y are two such structures, then the direct product of their
underlying sets, |X| x |Y|, can be made an algebra of the same
sort in the obvious way, writing \alpha(x,y) = (\alpha(x),\alpha(y))
and \beta(x,y) = (\beta(x),\beta(y)). But regarding the combination
of \alpha and \beta as maps X \coprod X --> X and Y \coprod Y --> Y,
we see that together they induce a map
(X \coprod X) x (Y \coprod Y) --> X x Y i.e.,
(X x Y) \coprod (X x Y) \coprod (X x Y) \coprod (X x Y) --> X x Y,
i.e., four rather than two unary operations on X x Y, which turn out
to be (x,y) |-> (\alpha(x),\alpha(y)), (x,y) |-> (\alpha(x),\beta(y)),
(x,y) |-> (\beta(x),\alpha(y)), (x,y) |-> (\beta(x),\beta(y)).
This kind of phenomenon is of interest -- e.g., if M and N are
R-modules, though we usually make M x N an R-module, sometimes
we prefer to make it an (R x R)-module instead.
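Here is a small sketch of the four induced operations; the sets X = Z/5Z and Y = Z/7Z and the particular operations are illustrative choices of mine:

```python
# Two unary operations alpha, beta on X and Y, regarded as maps
# X \coprod X -> X and Y \coprod Y -> Y, induce FOUR unary
# operations on X x Y, versus two for the componentwise structure.

alpha_X = lambda x: (x + 1) % 5
beta_X  = lambda x: (2 * x) % 5
alpha_Y = lambda y: (y + 1) % 7
beta_Y  = lambda y: (3 * y) % 7

ops_X = {"alpha": alpha_X, "beta": beta_X}
ops_Y = {"alpha": alpha_Y, "beta": beta_Y}

def pairwise(f, g):
    """The operation (x, y) |-> (f(x), g(y)) on X x Y."""
    return lambda p: (f(p[0]), g(p[1]))

# The usual componentwise structure: two operations.
componentwise = {name: pairwise(ops_X[name], ops_Y[name]) for name in ops_X}

# The structure induced via coproducts: four operations, one for each
# pair of choices of which operation acts on each factor.
induced = {(s, t): pairwise(ops_X[s], ops_Y[t]) for s in ops_X for t in ops_Y}

assert componentwise["alpha"]((1, 2)) == induced[("alpha", "alpha")]((1, 2)) == (2, 3)
assert induced[("alpha", "beta")]((1, 2)) == (2, 6)   # a "mixed" operation
```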
----------------------------------------------------------------------
> In proposition 8.9.3, it says that "In Set, direct limits commute
> with finite limits." This commutation should be understood as 'up to
> isomorphism', correct?
Well, limits and colimits are only defined up to isomorphism -- they
are objects with universal properties, and if one object has the
universal property, then anything isomorphic to it does. It is true
that if we have specific constructions of the limit and colimit in one
order and in the other, then the "commutativity" statement means that
the two resulting objects are isomorphic (by an isomorphism that
respects the appropriate structure).
See Definition 8.8.1, last sentence and then first two sentences.
----------------------------------------------------------------------
Regarding Proposition 8.9.3, you ask
> ... what other important categories have this nice property
> that directed colimits commute with finite limits?
Because finitary algebras define their operations using finite
direct products (a finite limit construction), we will be able
to prove in the next chapter, when we study the general concept
of algebra, that direct limits of finitary algebras arise from
direct limits of their underlying sets. That will be our main
use of this result; but that fact can then be used to prove that
the result you ask about holds for any variety of finitary algebras.
----------------------------------------------------------------------
> In proposition 8.9.11, you write of categories being
> generated by a set of morphisms. Is this more generally a useful
> way of thinking about a category, or is it mostly just the condition
> which makes the proposition true? ...
I think it can be useful when the category is used to "index"
something; e.g., when it is a diagram category over which we will take
a limit or a colimit. For instance, of the two categories illustrated
by centered displays in the paragraphs following Exercise 7.2:1, the
first one is generally pictured without showing the diagonal arrow,
because that is the composite of other arrows, and the second one
is shown (there and in general) without showing the composites of
the short arrows, for the same reason. And the point is not merely
that one gets a less cluttered picture, but that in defining a cone
to or from such a category, it is enough to make sure the arrows
of the cone make commuting triangles with the generating morphisms;
it then follows that they make commuting triangles with all morphisms.
Also, people sometimes look at how results on monoids can be
generalized to results on categories; and how results on rings
can be generalized to results on Ab-categories; and since
generation properties are of interest in those fields, they should
be of interest in the category-theoretic generalizations.
However, I have not seen the concept used except by myself -- in this
section, and a paper I wrote based on it.
----------------------------------------------------------------------
You ask about the statement preceding the display (8.9.5),
that by the construction of direct limits in Set, there exist
D(i)\in P and an element x_i satisfying that display.
Well, we have just shown that the left-hand side of that display
lies in the direct limit over D of the sets B(D,E_i). The
construction of direct limits in Set describes that direct
limit as an image of the disjoint union of the sets B(D,E_i).
What I don't say in Lemma 8.5.3, but is implicit, is that the maps
of the given sets into that image constitute the coprojections; in
this case, the q(D,E_i). So the desired element must be the image
of some element of some set B(D,E_i) under its coprojection. For
each i, we can name the "D" indexing that set as D(i), and we
can name the element in question in that set x_i. That gives
the display you asked about.
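The construction just invoked can be sketched concretely: form the disjoint union of the given sets, identify each element with its image under the connecting maps, and read off the coprojections as the maps into the quotient. The helper names and the example chain below are my own choices:

```python
# Sketch: direct limit of a chain X_0 -> X_1 -> ... -> X_{k-1} in Set,
# as the image of the disjoint union (elements tagged by index) under
# the identifications forced by the connecting maps.

def direct_limit(sets, maps):
    """Returns (classes, q): the colimit as a set of representatives,
    and q[i][x] = coprojection of x in X_i, computed by union-find."""
    parent = {(i, x): (i, x) for i, X in enumerate(sets) for x in X}
    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p
    for i, f in enumerate(maps):              # f : X_i -> X_{i+1}
        for x in sets[i]:
            parent[find((i, x))] = find((i + 1, f[x]))
    classes = {find(p) for p in parent}
    q = [{x: find((i, x)) for x in X} for i, X in enumerate(sets)]
    return classes, q

# A chain of inclusions {0} -> {0,1} -> {0,1,2}: the colimit has
# three elements, and the coprojections respect the connecting maps.
classes, q = direct_limit([{0}, {0, 1}, {0, 1, 2}], [{0: 0}, {0: 0, 1: 1}])
assert len(classes) == 3
assert q[0][0] == q[1][0] == q[2][0]
```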
----------------------------------------------------------------------
Regarding the lines preceding Corollary 8.9.9, you ask what is
meant by a good finite subset.
The word "good" is in quotation marks, so I am not using it in any
standard meaning. Rather, by a "good" finite set I mean a set that
satisfies some properties that will be helpful in getting the desired
conclusion. What those properties are is seen in the statement of
the Corollary.
----------------------------------------------------------------------
You ask about the background of the term "solution-set condition"
(introduced two paragraphs before Lemma 8.10.3).
Well, if you want to describe, say, the group presented by a
set X of generators and a set R of relations, then the functor
that you want to represent is the one associating to every group
the set of X-tuples of its elements which are solutions to the
system of equations R. Moreover, the condition that is needed
to prove that such a representing object exists is, in classical
language, that a *set* of groups with such X-tuples exists that is as
good as the *class* of all such groups and tuples. (In our language,
for "set" and "class" read "small set" and "large set".)
The above is my guess as to the background of the term. Unfortunately,
mathematicians seldom say what leads them to choose their terminology.
It could, rather, be more like what you suggested: a (small) set that
is a "solution" to the requirements of the proof.
----------------------------------------------------------------------
You ask (if I understand correctly) whether one can get the uniqueness
of adjoints using the construction of Theorem 8.10.5.
I don't think so. One of the marvelous facts about objects with
universal properties is that one can obtain them in different ways,
yet the different objects so obtained must be naturally isomorphic.
But the ways of constructing them don't give those isomorphisms;
that comes from the universal property.
But your question adds the words "or other characteristics of
possible adjoints"; and to this, one can sometimes give a positive
answer. For instance, though no one in this class has yet seen
what natural property the groups of (3.3.3) and Exercises 3.3:1-2
are universal for, once one does, the method of construction as
a subgroup of a direct product does give one a bound on the order
of the groups with that property. And studying which elements of
the direct product lie in the subgroup, one can further improve
that bound.
----------------------------------------------------------------------
Pointing to the contrast between Exercises 8.10:6 and 8.10:7, you
ask about general results on universal constructions in classes of
algebras with large sets of operations.
I don't know whether such results have been looked at. In General
Algebra, it is most natural to consider algebras with small sets
of operations. Complete lattices are most naturally thought of as
having meets and joins of arbitrary subsets; but operations on
subsets don't fit the techniques of General Algebra. The best way
one can accommodate them is to treat them as "large" families of
operations on tuples of elements. As such, they are more like the
kinds of structures studied in General Algebra than sets with
operations on subsets would be, but they still push the envelope.
I think "random" sorts of algebras with large classes of operations
will "almost never" have free objects; it is only certain sorts
that arise in special ways -- such as complete semilattices --
that have these.
----------------------------------------------------------------------
You ask why objects characterized by right universal properties are
usually easier to construct directly than those characterized by left
universal properties, as mentioned near the end of section 8.10.
I think it is because the objects we look at in algebra are defined
using maps on direct products of underlying sets, and direct products
respect right universal constructions. As a result, right universal
constructions on algebras turn out to be based on the corresponding
constructions on their underlying sets.
----------------------------------------------------------------------
You ask about the discussion in the last two paragraphs of section 8.12
where I state that the concept of "Cat-based category" is invariant
under reversing order of composition.
First, let me be explicit about what this means: It means that
if C is a Cat-based category, then C^{op}, defined by letting
C^{op}(P,Q) = C(Q,P), is also a Cat-based category if one takes the
same concept of morphisms-among-morphisms that one had before.
(It also becomes a Cat-based category if one reverses the directions
of morphisms-among-morphisms; but let's just look at one modification
at a time.) The point is that if one has a concept of composition,
given by morphisms C(Q,R) x C(P,Q) -> C(P,R), then one can regard
these as morphisms C^{op}(R,Q) x C^{op}(Q,P) -> C^{op}(R,P), and
the left-hand side can be rewritten C^{op}(Q,P) x C^{op}(R,Q),
giving precisely the kind of map one needs to define a category
structure C^{op}.
As for the composition of morphisms among morphisms, this is based
on the "op" functor, which goes from Cat --> Cat, not from
Cat to Cat^{op}.
Anyway, I suggest that to get the idea without the complications
of Cat-based categories, you look at the very last (3-line) paragraph
on the page, which talks about the "op" construction on ordinary
(Set-based) categories, and note that the opposite of an ordinary
category is again an ordinary category, not a Set^{op}-based category.
----------------------------------------------------------------------
Regarding the last paragraph of section 8.12, you ask "why do we
need the product of Set to define "op" functor?"
We use the product construction in Set in defining the concept of
category, since "composition" is defined by maps C(Y,Z) x C(X,Y) ->
C(X,Z) (where "x" here stands for "direct product of sets"). So
when we define the opposite of a category, we have to see how to take
such maps defined on product-sets of C 's hom-sets and turn them into
maps on product-sets of C^{op} 's hom-sets. To fit the changed domains
and codomains of our maps, this turns out to require reversing the
order of "C(Y,Z)" and "C(X,Y)", and that uses the symmetry of the
direct product of sets.
----------------------------------------------------------------------
You ask, regarding Proposition 9.1.6, "What goes wrong in the
construction of colimits of algebras?"
To answer that, I need some idea of how you think they would be
constructed. So please let me know that, and I'll reply!
----------------------------------------------------------------------
Regarding Lemma 9.2.2, you ask whether, when \gamma is singular, the
sequence of subsets S^{(\alpha)} always continues to grow past
S^{(\gamma)}.
Certainly not! For the easiest case, one can start with X = A;
and then all S^{(\alpha)} are equal to A, i.e., the growth
stops at the first step. By other constructions, one can obtain
examples where the growth stops at any specified ordinal less
than or equal to the value given by Lemma 9.2.2.
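For a single finitary binary operation, one can watch the chain S^(0) \subseteq S^(1) \subseteq ... stabilize directly; in the sketch below (the operation, addition mod 10, and the generating sets are illustrative choices of mine) the growth stops after finitely many steps:

```python
# Sketch: the subalgebra generated by a subset, computed by iterating
# S^(0) \subseteq S^(1) \subseteq ... until nothing new appears.

def closure(generators, op):
    """Closure of `generators` under the binary operation `op`."""
    S = set(generators)
    while True:
        new = {op(x, y) for x in S for y in S} - S
        if not new:            # the chain has stabilized
            return S
        S |= new

add_mod_10 = lambda x, y: (x + y) % 10
assert closure({2}, add_mod_10) == {0, 2, 4, 6, 8}
assert closure({0}, add_mod_10) == {0}   # growth can stop at the first step
```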
----------------------------------------------------------------------
Regarding Exercise 9.2:5, you ask,
> ... what does compact as an element of the lattice of subalgebras
> of A mean?
Did you look up "compact" in the index? If, after doing so, you
still have a question about it, let me know and I'll try to help.
----------------------------------------------------------------------
Regarding the concepts of identity and variety (Definitions 9.4.1
and 9.4.6), you ask
> ... Could we interpret \Omega-algebras satisfying sets of identities
> as structures satisfying certain first-order theories in languages
> possessing only function symbols in their signatures? ...
It's not only the signature that has to be restricted, but also
the syntax: No existential quantifiers, no negation, no implication,
no disjunction. Just universally quantified equations. (One could
allow conjunction, since having a conjunction in one's theory is
just equivalent to having each of the conjoined identities; but
esthetically it seems nicest to leave conjunction out here.)
Certain slightly more general languages give classes of algebras that
can also be treated nicely. For instance, if one allows, along with
sentences of the preceding sort, universally quantified sentences
of the form (conjunction of equations) => (equation), then one
finds that one still gets free algebras and algebras presented by
generators and relations, and that a class of algebras defined
by such sentences is closed under almost all (but not quite all)
of the operations discussed in today's reading. Such a class is
called a "quasivariety of algebras", and these are also studied in
universal algebra. An example of a quasivariety is the class of
torsion-free groups; another is the class of rings without nonzero
nilpotent elements (elements satisfying x^n = 0 for some n).
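Since identities are universally quantified equations, for a finite algebra one can verify them by evaluating both sides on every tuple of elements. A minimal sketch, restricted to one binary operation and two-variable identities (all the particular choices here are mine):

```python
# Sketch: brute-force verification of a universally quantified
# equation s = t in a finite algebra with one binary operation.
from itertools import product

def satisfies(elements, op, lhs, rhs):
    """Does the 2-variable identity lhs = rhs hold throughout?"""
    return all(lhs(op, x, y) == rhs(op, x, y)
               for x, y in product(elements, repeat=2))

# The identity x y = y x (commutativity), as a pair of term functions:
comm_lhs = lambda op, x, y: op(x, y)
comm_rhs = lambda op, x, y: op(y, x)

assert satisfies(range(6), lambda x, y: (x + y) % 6, comm_lhs, comm_rhs)
assert not satisfies(range(6), lambda x, y: (x - y) % 6, comm_lhs, comm_rhs)
```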
----------------------------------------------------------------------
Regarding section 9.4 you ask
> What sort of questions in specific theories like group theory or ring
> theory do the results of this section help us to answer or rephrase
> in a more manageable manner?
None occur to me. I would say the question is like asking, "What
sorts of problems about the complex numbers does the concept of a
field help us answer?" The value of that concept is to generalize
from specific structures, like the complex numbers, general properties
that can be studied in a much larger class of cases. Results can then
be -- and are -- proved about fields in general, and applied to the
complex numbers in particular; but those results could, in principle,
have been proved for the complex numbers alone, if we were sure that
that field was all we would ever care about.
Our goal from the beginning of the course was to see what was the
common context in which results regarding free objects, construction
by generators and relations, consideration of subclasses determined
by identities, etc., could be proved for varied classes of algebras
such as groups, rings, and so on. And we have just done that.
The results of this reading may look dull, because they are things
that we already knew for the specific classes of algebras that we are
familiar with. In subsequent sections, we shall get past these "dull"
basics, and the material developed will hopefully look more interesting.
----------------------------------------------------------------------
Regarding section 9.4 you note,
> In chapter 1, one exercise gives an alternative formulation of the
> concept of group: we can get away with just the one 2-ary operation
>
> delta(x, y) = xy^-1.
>
> This gives us two varieties that are basically the same: ...
Actually, the description of groups in terms of the operation delta
does not quite give a variety; if one just uses identities, then the
variety one gets consists of structures corresponding to groups, and
also an empty structure. The category of groups is equivalent to the
subcategory of nonempty algebras in this variety.
However, that quibble (which one can get around by throwing into the
above variety a zeroary operation e and an identity delta(x,e)=x, or
delta(x,x)=e) doesn't invalidate the point you make:
> ... We could say the varieties are isomorphic as categories, but
> that doesn't say it all, so is there a term for this "stronger than
> isomorphism but not quite equality" of varieties?
I'm not sure whether there is a standard term. In the language of
sections 9.9-9.10, which we will read in about a week, the relation
is that the two varieties of algebras have isomorphic "clones of
operations". We will see that two varieties are related in this
way if and only if there is an equivalence between them that
respects underlying-set functors. Birkhoff named two algebras
(as distinct from {\em varieties} of algebras) of possibly different
types "crypto-isomorphic" if (in the above language) their clones of
operations were isomorphic, which is equivalent to saying that the
varieties they generate are equivalent in this way, via an equivalence
that takes one algebra to the other.
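The delta formulation in the question can be checked concretely: one recovers e, inverses, and the group multiplication from delta alone. The sketch below uses the additive group Z/12Z (so delta(x,y) = x - y), my choice of example; note the recovery of e needs a nonempty carrier, which is exactly the empty-algebra quibble above.

```python
# Sketch: recovering the group structure from delta(x, y) = x y^{-1},
# checked in the additive group Z/12Z, where delta(x, y) = x - y.

n = 12
delta = lambda x, y: (x - y) % n

e = delta(5, 5)                       # delta(x, x) = e, for any x
inv = lambda y: delta(e, y)           # delta(e, y) = y^{-1}
mul = lambda x, y: delta(x, inv(y))   # x y = delta(x, y^{-1})

assert e == 0
assert inv(7) == 5
assert all(mul(x, y) == (x + y) % n for x in range(n) for y in range(n))
```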
----------------------------------------------------------------------
Regarding section 9.4 you ask,
> ... Is there a general way in \Omega-Alg to express the things
> where a property exists for some element. For example the groups for
> which there is an element of order 5, but not necessarily satisfying
> the identity x^5=e for all x?
Well, elements g\in G satisfying g^5 = e correspond to morphisms from
Z/5Z to G. Such an element will have order 5 if the morphism
does not factor through the natural map to the trivial group. So
one can describe groups having elements of order 5 by the condition
that they will have such a non-factoring morphism from Z/5Z. But
it isn't a very natural class of groups -- it isn't closed under
homomorphic images or subalgebras; and in general, one can't do
universal constructions in it. Perhaps the fact that one can describe
it is what you are looking for; but giving such classes a name isn't
likely to be helpful.
There are two directions one can go from there to get more natural
classes. One is to look at the category consisting of groups with
a distinguished element g satisfying g^5 = e (dropping the
requirement that this element not equal e). This is essentially
a variety of algebras: We take the description of the variety Group,
and adjoin one additional zeroary operation, specifying the
distinguished element, and one additional identity, saying that
the new element should have exponent 5. Homomorphisms between such
structures should, of course, take distinguished element to
distinguished element.
The other is to consider groups _not_ having any elements of
order 5. That is not, I think, equivalent to a variety; it is
defined by operations, and identities, and the additional condition
(\forall x) (x^5 = e) => (x = e). Classes of algebras defined
by such conditions -- identities together with universal equational
implications -- are called "quasivarieties", and they have properties
nearly as nice as those of varieties. This quasivariety is really the opposite
of what you asked for (groups with an element of order 5); but perhaps
you'll find it as interesting as what you asked for.
I guess there's a third answer to your question. Model theorists
consider any sort of family of first-order sentences, and look at
the class of "models" of that family, calling such classes "axiomatic
model classes". So if we let the family be the identities for groups,
together with the sentence
(\exists x) ((x^5 = e) and not (x = e))
then the resulting axiomatic model class is what you are asking about.
But in considering such classes, model theorists are pretty far from
universal algebraists.
----------------------------------------------------------------------
You ask whether one has an analog of Cayley's Theorem for objects
of an arbitrary variety of algebras (Definition 9.4.6).
The examples where we had "Cayley's Theorem" type results were all
for classes of structures that were defined as models of certain
sorts of mathematical phenomena; and in those cases, it turned out
that the sorts of structures so defined modeled the phenomena in
question sufficiently nicely that any structure fitting our definition
could be realized in the way that motivated the concept.
Though the majority of the varieties that algebraists study are models
of general sorts of phenomena, an arbitrary variety is not required
to be motivated in such a way; so there is no natural formulation
of a "Cayley's Theorem".
Even when a variety is so motivated, the characterization may not
be good enough to give a "Cayley's Theorem". The result of this
sort for Lie algebras is the Poincare-Birkhoff-Witt theorem, which
shows that every Lie algebra over a field can be represented
by commutator brackets in an associative algebra over the same
field; but as noted in the last sentence of Exercise 9.7:2, this
is not true of Lie algebras over a general commutative ring.
----------------------------------------------------------------------
You ask about the terminology used in display (9.5.4), in particular,
the terms "model" and "first-order theory".
Model Theory studies structures consisting of a set given with a family
of relations of various arities on it, some of which may be operations,
and statements about it which can be expressed by formal sentences
constructed using element-symbols, relation symbols corresponding to
the given relations, and logical operations such as "implies", "for all",
"there exists", etc..
A "first-order sentence" allows these operations, but with "for all"
and "there exists" applicable only to elements, not to relations.
(So a statement like "X is infinite", though it can be expressed by
saying "there exists a set of pairs of elements which satisfies the
conditions to be a function X -> X that is one-to-one but not onto",
is not equivalent to a first-order sentence.) Given a language
involving certain relation-symbols, a "model" for that language is a
set given with relations corresponding to the relation-symbols of the
language; and the "theory" of a model or family of models is the set of
those statements in the language which are true for all these models.
A model of a theory is any model which satisfies all the sentences in
the theory.
Our concept of a variety is a restricted case of these concepts:
The only relations we consider are operations, and the only sentences
we consider are universally quantified equations (so, nothing with
"not", "or", "there exists", etc.)
----------------------------------------------------------------------
Regarding the discussion of Vopenka's principle following (9.5.4),
you ask what the "special properties" of the cardinal mentioned in the
discussion there are.
I'm afraid you'd have to ask a logician that! Sorry.
----------------------------------------------------------------------
> Does the term "derived operation" (definition 9.5.1) have anything
> to do with derived functors and derived categories? ...
I don't think so. Vague everyday words like "normal", "derived",
"regular", etc. get borrowed over and over again into mathematics,
with generally unrelated meanings.
----------------------------------------------------------------------
In connection with Proposition 9.6.3(iv), you ask why the endomorphism
extending the set map v is not assumed to be unique.
The condition "F is generated by the image of X" forces uniqueness.
(In this context, it is strictly stronger than uniqueness: E.g., in
the situation of Exercise 9.6:4(i), uniqueness holds, but the monoids
in question are not free objects in a _subvariety_ of Group. So
since we have to state the stronger condition of being generated by
v(X), there's no point in bringing in the implied condition of
uniqueness of extending homomorphisms.)
----------------------------------------------------------------------
In connection with Exercise 9.6:9 and the following discussion, you ask
> Is it easy to find examples of rings that satisfy S_{2d+1} = 0 but
> not S_{2d} = 0 ... ? ...
Hmm -- fiddling around, I think that S_{2d+1} = 0 is equivalent to
S_{2d} = 0. Namely, if in S_{2d+1} we substitute 1 for, say, the
last indeterminate, and evaluate S_{2d+1}(x_1,x_2,...,x_{2d},1),
we find that those terms where the "1" appears in the
last position give S_{2d}(x_1,x_2,...,x_{2d}), those where it
appears in the next-to-last position give the negative of this,
those where it appears in the third-from-last position give
S_{2d}(x_1,x_2,...,x_{2d}) again, and so on; and since there are
an odd number of positions, we get exactly S_{2d}(x_1,x_2,...,x_{2d}).
So the identity S_{2d+1} = 0 implies S_{2d} = 0.
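For small d this substitution can be checked by direct computation. Here is a quick sketch of my own, not from the text, over 2x2 integer matrices, with S_n defined as the signed sum over all permutations:

```python
from itertools import permutations

def mat_mul(A, B):
    """Product of 2x2 matrices represented as tuples of tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

I2 = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

def standard_poly(mats):
    """S_n(x_1,...,x_n): the sum over permutations s of sgn(s) x_{s(1)}...x_{s(n)}."""
    total = ZERO
    for perm in permutations(range(len(mats))):
        prod = I2
        for i in perm:
            prod = mat_mul(prod, mats[i])
        s = sign(perm)
        total = tuple(tuple(total[r][c] + s * prod[r][c] for c in range(2))
                      for r in range(2))
    return total
```

Here standard_poly([A, B, I2]) agrees with standard_poly([A, B]) for any 2x2 matrices A and B (the d = 1 case of the substitution argument), and standard_poly of any four 2x2 matrices is zero, illustrating the identity S_4 = 0 discussed below.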
> ... Is there a theory of associative algebras with an extra
> S_n-like operation, akin to Lie algebras?
Not that I've heard of.
> ... Is "S_4 = 0" the simplest nontrivial identity satisfied by the
> 2 by 2 matrices?
I believe it is the polynomial identity of lowest degree; but
an identity that one may find conceptually simpler is
(XY-YX)^2 Z = Z (XY-YX)^2; i.e., the square of every commutator
is in the center. This can be proved for matrices over the complex
numbers by noting that XY-YX has trace 0, and verifying that
every matrix with trace 0 has square a scalar matrix. (That fact
in turn can be "seen" from the fact that among matrices with a
given trace, in particular among those with trace 0, the diagonalizable
ones form a dense subset, and it is clear that a
trace 0 diagonalizable 2x2 matrix has scalar square.) Knowing the
result for matrices over the complex numbers, and the fact that the
field of complex numbers generates the variety of commutative rings,
one can deduce that the identity holds for matrices over any
commutative ring.
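Both facts are easy to test numerically. A small check of my own, not from the text, on 2x2 integer matrices:

```python
def mat_mul(A, B):
    """Product of 2x2 matrices given as tuples of tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def comm(A, B):
    """The commutator XY - YX."""
    P, Q = mat_mul(A, B), mat_mul(B, A)
    return tuple(tuple(P[i][j] - Q[i][j] for j in range(2)) for i in range(2))

def commutator_square_is_central(X, Y, Z):
    """Check the identity (XY-YX)^2 Z = Z (XY-YX)^2."""
    C = comm(X, Y)
    C2 = mat_mul(C, C)
    return mat_mul(C2, Z) == mat_mul(Z, C2)
```

One can also observe in such examples that comm(X, Y) always has trace 0 and that its square comes out a scalar matrix, as the argument above predicts.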
----------------------------------------------------------------------
You asked about the fact, mentioned following Exercise 9.6:10,
that the concept of "heap", developed in that exercise, had
been rediscovered many times, and what could have led to that
repeated rediscovery.
I think that two contrasting aspects of the concept of heap are
relevant: That it is fairly natural and moderately useful, but that it
does not invite study for its own sake, since the isomorphism classes
of heaps (other than the empty heap) are completely determined by
the isomorphism classes of the corresponding groups. Naturalness and
usefulness lead people to discover the concept, but since the theory
completely reduces to that of groups, not many works get written about
heaps, and no one ends up specializing in them; so few people hear of
the concept, and those who need it often end up reinventing it.
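For the record, the heap operation derived from a group is usually taken to be t(x, y, z) = x y^{-1} z. A small sketch of mine, in additive notation on Z/nZ, showing how any chosen element can serve as the identity of a recovered group:

```python
def heap_op(n):
    """Heap operation on Z/nZ derived from addition: t(x, y, z) = x - y + z."""
    return lambda x, y, z: (x - y + z) % n

def group_from_heap(t, elements, e):
    """Recover a group multiplication from a heap by choosing any element e
    as the identity: x * y = t(x, e, y)."""
    return {(x, y): t(x, e, y) for x in elements for y in elements}
```

Whatever e is chosen, the recovered groups are isomorphic (via translation by e), which is the sense in which the theory of heaps reduces to that of groups.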
When I rediscovered them I called them "isogroups"; see
p.60 of https://link.springer.com/article/10.1007%2FBF02188011 ,
line after display (5). After that appeared, someone told me that
they had already been defined.
The same situation -- being natural and useful, but having a
theory that essentially reduces to other theories -- is also true
of preorders. I don't know whether they too have been rediscovered
several times. If not, perhaps the situations in which they come
up are sufficiently widespread that once a name was assigned to
the concept, enough people heard of it to prevent the "rediscovery"
syndrome.
----------------------------------------------------------------------
You ask about the type \Omega of the variety of Lie algebras
(section 9.7).
The list of operations begins with the structure of k-module,
which has one 0-ary operation (the element 0), one binary
operation (addition), and a unary scalar-multiplication operation
for each element of k. (If one builds up the concept of
a k-module starting from that of an abelian group, then one
also has the unary operation of additive inverse, and I list
that in (9.7.2). However, that is equivalent to the operation
of multiplying by -1\in k, so it can be omitted, as I have
done above.) Finally, in addition to all of these, there is the
binary operation of Lie bracket.
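As a concrete instance of this type, the cross product makes k^3 a Lie algebra. The sketch below is mine, not from the text; it lists the operations for k = Z and checks antisymmetry and the Jacobi identity on sample vectors:

```python
def add(x, y):
    """The binary module operation (addition of 3-vectors)."""
    return tuple(a + b for a, b in zip(x, y))

def scale(c, x):
    """One unary scalar-multiplication operation for each c in k."""
    return tuple(c * a for a in x)

ZERO = (0, 0, 0)  # the zeroary operation

def bracket(x, y):
    """The Lie bracket: here the cross product on 3-vectors."""
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def jacobi(x, y, z):
    """[x,[y,z]] + [y,[z,x]] + [z,[x,y]], which should equal ZERO."""
    return add(add(bracket(x, bracket(y, z)),
                   bracket(y, bracket(z, x))),
               bracket(z, bracket(x, y)))
```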
----------------------------------------------------------------------
Regarding the concept of a Lie algebra over a commutative ring
k (section 9.7), you ask
> ... What falls apart if k is allowed to be noncommutative?
There is no appropriate way to define a bilinear map on a module
over a noncommutative ring. In other contexts, the "correct"
noncommutative generalization of a bilinear map of modules is
to consider a right R-module M and a left R-module N, and
consider a map f: M x N --> A, where A is an abelian group,
and f(xr, y) = f(x, ry) for all r\in R. (More generally,
M can be an (S,R)-bimodule, N an (R,T)-bimodule, and A
an (S,T)-bimodule, for rings S and T which may or may not
equal R, and one can add the conditions f(sx, y) = sf(x, y),
f(x, yt) = f(x, y)t for s\in S, t\in T.) However, condition
(9.7.5), i.e., [x,y] = -[y,x], means that we can't distinguish
"right factors" from "left factors" in the operation of a Lie
algebra; and there is no decent version of bilinearity of a map of
modules over a noncommutative ring that makes sense without such
a distinction.
But there is a concept that does combine a Lie algebra with
a noncommutative algebra; I will mention it briefly in class:
A "Poisson algebra" (over a commutative ring k) is a
k-module together with two multiplications, one associative
and one Lie, such that Lie bracket with every element is a
derivation on the associative algebra structure.
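Any associative algebra gives an example: with the commutator as the Lie multiplication, bracketing with a fixed element is a derivation of the associative multiplication, i.e., the Leibniz law [A, BC] = [A,B]C + B[A,C] holds identically. A quick check of mine on 2x2 integer matrices:

```python
def mat_mul(A, B):
    """The associative multiplication: product of 2x2 matrices."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def mat_add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))

def comm(A, B):
    """The Lie multiplication: [A, B] = AB - BA."""
    P, Q = mat_mul(A, B), mat_mul(B, A)
    return tuple(tuple(P[i][j] - Q[i][j] for j in range(2)) for i in range(2))

def leibniz_holds(A, B, C):
    """Check that [A, -] is a derivation: [A, BC] = [A,B]C + B[A,C]."""
    return comm(A, mat_mul(B, C)) == mat_add(mat_mul(comm(A, B), C),
                                             mat_mul(B, comm(A, C)))
```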
----------------------------------------------------------------------
You write
> In the middle of section 9.7, it is said there is a connection
> between Lie algebras and Lie groups. Does this connection allow us
> to identify an algebra with a (possibly) group or vice-versa?
As the sentence preceding Exercise 9.7:6 suggests, it is a many-one
relationship: The many different Lie groups that look alike in a
neighborhood of the identity have the same Lie algebra. For each
finite-dimensional Lie algebra L over the real numbers, there is a
unique *simply connected* Lie group G having L as its Lie algebra.
Other Lie groups having L as their Lie algebra can be obtained
by dividing G by a discrete subgroup, and/or throwing in
more connected components. For a trivial case: if L is the
1-dimensional Lie algebra, which by (9.7.3) has zero bracket operation,
then G is the real line R, while other Lie groups with the same
Lie algebra include the circle group R/Z, and various groups with
R or R/Z as the connected component of the identity element.
----------------------------------------------------------------------
In connection with the discussion following Exercise 9.7:5 of how
every Lie group gives a Lie algebra, you ask whether the reverse is
true.
Every finite-dimensional Lie algebra over the real numbers arises
from a Lie group (necessarily of the same dimension). That group
is not unique, but for any such Lie group, its universal covering
space is again a Lie group, and is the unique simply connected Lie
group, up to isomorphism, which gives the indicated Lie algebra.
For the infinite dimensional case, one would need to specify a concept
of infinite dimensional manifold to use in defining infinite dimensional
Lie groups. I know that people do work with such concepts, but there
are varied choices of definitions, and I don't know for what choices,
if any, such a result has been proved. Infinite dimensional Lie
algebras are perfectly natural algebraically, however, as one can
see from the part of this section up to where the relation with Lie
groups is introduced.
----------------------------------------------------------------------
You ask whether people study nonassociative algebras other than
Lie algebras (section 9.7).
Yes, but probably more work is done with Lie algebras than with
all other sorts of nonassociative algebras combined. The next
largest area is that of Jordan algebras, mentioned near the end
of this section. Another is "power-associative" algebras; i.e.,
algebras whose 1-generator subalgebras are associative; these
satisfy identities such as x(xx) = (xx)x.
I have a few papers in which results are proved for nonassociative
algebras simply because the arguments needed to prove certain things
for Lie algebras (which were what my coauthor in the first two of
those papers cared about) didn't really require the Lie identities.
The results in the last paper in the group have the curious property
that they hold for varieties of k-algebras whose identities have
certain properties, satisfied in particular by associative, Lie, and
Jordan algebras (and many more), but definitely not by all varieties.
I can give you copies if you're interested.
----------------------------------------------------------------------
Regarding the concepts of a clone of operations (Definition 9.9.1) and
a clonal category (Definition 9.9.5), you ask whether these concepts
have applications to other fields of mathematics.
The idea of a "clone of operations" is a generalization of specific
concepts like "the set of all derived group-theoretic operations",
"the set of all derived lattice-theoretic operations", "the set
of all continuous operations on a topological space", etc.. These
specific concepts are looked at in group theory, lattice theory,
topology, etc.. General Algebra (or Universal Algebra) is
the field of mathematics where one abstracts from these specific
situations and looks at what one can say about such things in
general; so from that point of view, the concept of clone "belongs
to" General Algebra. People in other fields may or may not find
it valuable to put the results they obtain about the objects that
they look at in such a more general context; insofar as they do,
they will find the language of general algebra useful.
Some theoretical computer scientists have been happy to adopt
concepts from Category Theory and General Algebra, including
that of a clone of operations. Whether the use they make of
these is in fact valuable, I don't know.
----------------------------------------------------------------------
You ask about many-sorted algebras, mentioned in the third paragraph
after (9.9.4). These are very much like 1-sorted algebras. Instead of
an underlying set, one has an S-tuple of underlying sets, where S
is the set of "sorts"; and the arity of each operation is a list
(with repetitions allowed) of the "sorts" of the arguments, and a
specification of the "sort" of the output. When one defines a free
algebra, instead of having a single free generating set, one has an
S-tuple of generating sets, so that the algebra is free on a family
of generators of specified sorts.
One of the most natural examples is given by graded rings. Such an
object (graded, let us say, by the natural numbers, for simplicity)
is usually described in ring theory as a ring R that is given with
a direct sum decomposition R = \sum_i R_i, the summand R_i being
called the "homogeneous component of degree i", subject to the
condition that any product of an element of R_i and an element
of R_j is an element of R_{i+j}. This definition is adequate,
but it is really most natural to regard the graded ring as a system
of abelian groups R_i with multiplication maps R_i x R_j --> R_{i+j}
satisfying appropriate identities.
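A minimal sketch of this many-sorted view (mine, not from the text), taking each R_i to be the homogeneous integer polynomials c t^i, represented as pairs (i, c):

```python
# Sorts are the degrees i; each component R_i is an abelian group, and
# multiplication is a family of maps R_i x R_j -> R_{i+j}.

def add_hom(p, q):
    """Addition is defined only within a single sort R_i."""
    (i, a), (j, b) = p, q
    assert i == j, "cannot add elements of different sorts"
    return (i, a + b)

def mul_hom(p, q):
    """The multiplication map R_i x R_j -> R_{i+j}: (a t^i)(b t^j) = ab t^{i+j}."""
    (i, a), (j, b) = p, q
    return (i + j, a * b)
```

Note that for each pair of input sorts (i, j), the output sort i + j is part of the arity specification of the operation.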
In ordinary one-sorted algebras, people can (although I don't like
to) exclude the empty algebra, and make ad hoc definitions to get
around the difficulties this produces, without losing a nontrivial
amount of information about the theory of such algebras. But in a
many-sorted algebra, any subset of the set of sorts may be empty,
so that requiring that every sort should be nonempty loses a nontrivial
amount of information. (Though this is not illustrated by graded rings,
since the additive structure of each R_i leads to a zeroary
operation with output "0_i" in each R_i.) There is an article about
some of the resulting complications in the theory, "The point of the
empty set" by Michael Barr, Cahiers Topologie GĂ©om. DiffĂ©rentielle
13 (1972), 357-368, MR 48 #2216, though I haven't read it. (From the
MR review, it requires the theory of "tripleability", which we haven't
covered.)
----------------------------------------------------------------------
You ask about the two definitions of hyperidentity -- the one
given before Exercise 9.9:14, and the one mentioned parenthetically
before Exercise 9.9:15.
These are very different concepts; it is unfortunate that they
are given the same name. I find the definition in which the
identities are assumed only for the primitive operations an
unnatural one: the choice of which operations to regard as the
"primitive" ones is just a matter of convenience when defining
a mathematical concept. (E.g., one _could_ define "group" using
only the operations (x,y) |-> x y^{-1} and e, and then that
operation would become primitive while multiplication would no
longer be.) So to require that "all primitive operations" satisfy
some identities seems very arbitrary.
> ... as far as I can tell, the distinction does not matter in
> Exercise 9.9:14.
It does. If we replace (a) by the statement that all primitive
unary operations are equal, this would not imply the same for
all derived unary operations. For instance in Group, there is
only one primitive unary operation (the operation of inverse), so
all primitive unary operations are equal; but the derived unary
operation x . x is not equal to the one primitive unary operation.
What you may have noticed is that the significance of (b) is
unchanged if we replace "primitive" by "derived". That is
true precisely because (b) is equivalent to condition (a),
which is a hyperidentity in the sense I use.
----------------------------------------------------------------------
Regarding the last paragraph of section 9.9, you ask
> Does this construction correspond in any way to an enriched structure
> on the category C?
I think that a functor \box of the indicated sort is the kind of
thing one uses in defining a "category over C", i.e., a category D
whose morphism-sets are objects of C, and whose composition operations
are morphisms D(Y,Z) \box D(X,Y) --> D(X,Z). So it's not C that is
an enriched category, but the resulting categories D. However, the
assumptions one has to make on \box for this discussion may be weaker
than those needed to define enriched categories.
I don't know much about operads -- I've just seen the concept
sketched, and in these paragraphs, I'm pointing the interested
reader to something that he or she might want to look into.
----------------------------------------------------------------------
Concerning the second line after (10.1.5), you write
> ... you say that $x, y \in |SL(n, A)|$ are the images of associated
> universal element $r\in |SL(n, R)|$ under homomorphisms
> $f, g : R --> A$. You mean V(f)(r) = x and V(g)(r) = y?
Strictly speaking, yes. However, here I am writing informally,
and given a homomorphism f of rings, and a matrix M over the
domain of f, one can speak of "applying f to M," meaning
that one applies it entrywise. Hmm, maybe if I change "under"
to "via" it would at least make clear that the reader has to
think about how x and y are obtained from f and g.
----------------------------------------------------------------------
Regarding the concept of cogroup, sketched at the end of section 10.1,
you ask
> Do such things as cogroups arise naturally in other contexts than
> looking at the existence of an adjoint?
This just depends on what one considers a "good motivation". I find
the question "which functors have adjoints?" to be of great interest,
so I motivate coalgebras in these terms. Alternatively, one can
say "Many of the most basic constructions of algebra, when regarded
as set-valued functors, turn out to be representable. Yet they also
often have algebra structures (e.g., the construction SL(n)). How
does such algebra structure arise?" As sketched in section 10.1, it
arises from a coalgebra structure on the representing objects.
----------------------------------------------------------------------
Regarding the development of V-algebra objects of a general category,
i.e., objects analogous to systems of algebras defined by operations
and identities, in section 10.2 (and also the fact that we developed
the theory of varieties, and not of more general classes of algebras,
in earlier sections), you ask
> ... Why do we stop at identities -- why not proceed to arbitrary
> predicate calculus statements, then to arbitrary \Pi_1 statements ...
Two reasons. The first was illustrated by the exercise in Chapter 2
showing that there do not exist free fields, and the exercise in
Chapter 4 to show that a commutative ring R does not, in general,
have a universal homomorphism to an integral domain. We are looking
for constructions with universal properties, and classes of models of
arbitrary sentences in the predicate calculus (e.g., the sentences
defining fields, and integral domains) do not typically admit these
constructions. The class of models of a family of universally
quantified equations, i.e., identities, does.
Secondly, not every sort of sentence in the predicate calculus has
a reasonable category-theoretic translation.
There are ways to remedy each of these problems. One can investigate
what sorts of sentences, other than identities, do yield classes of
models allowing universal constructions, and develop the appropriate
topics -- yielding "quasivarieties", "prevarieties", and related
concepts. And one can look for special classes of categories for
which one can define the analogs of models of general sentences,
leading into the theory of "topoi".
The first set of concepts would allow us to generalize the material
of this chapter; and if I ever find time to write further chapters,
I intend to introduce it; but they wouldn't be able to fit into the
scope of a 1-semester course (unless we left out a lot of other things
we've done). On the other hand, as to using a topos as the category
in which we define our algebra objects, this would be very restrictive.
E.g., the opposite of the category of commutative rings is not a
topos, so our expression of SL(n,-) in terms of a group object in
that category could not be expressed in this context. If we wanted a
context in which we could do these things, it would have to be one
which would exclude most of what I center this course around. This
could be a supplementary topic for a continuation of this course, but
not one that subsumes what we have been doing.
Incidentally, I use the concepts of quasivariety and prevariety in a
recent preprint, http://math.berkeley.edu/~gbergman/papers/pv_cP.pdf ,
in which I recall the definitions (in section 2); you might find that
paper interesting to look at.
----------------------------------------------------------------------
Regarding the concept of algebra object in a category
(Definition 10.2.4), you ask how we will be using it, saying
> ... Even though I read the whole section, I am not grasping the point
> of having this concept...I feel like it is going back and forth.
Perhaps the following will help.
First think of algebra objects, not in terms of the question "How will
we be using them?", but as an answer to the question, "How can we
take the concept of an algebra, which is defined as an object of the
category Set together with certain additional structure, and get a
generalization with Set replaced by a more general category C?"
The answer is quite straightforward: In the usual concept of
algebra, we have a set |A| with some maps |A| x ... x |A| --> |A|,
which we call operations; so for the modified concept, we will
assume the category C has finite products, take an object of C
which we will call |A|, and define "operations" to be morphisms
|A| x ... x |A| --> |A| in C.
As for the material going "back and forth"; what you have to
see is why it does so. After we define the concept of an algebra
object in a category C, we find that it leads to a family of algebra
objects in the usual sense: Just as every object X of a category
C allows us to create a large family of sets, namely the hom-sets
C(Y,X) for the different objects Y, so we find that an "algebra A
in C" leads to a family of "algebras in Set", i.e., algebras in the
traditional sense, namely, the sets C(Y,|A|) with algebra structures
induced by the algebra structure of A, via the universal property
of direct products. But the algebra object A in C is not these
algebras; they can be thought of as its "shadows" in the category
Set. However, we can study it using these "shadows"; in particular, we
prove that A will satisfy the diagrammatic conditions corresponding
to any identities if and only if these "shadows" satisfy those
identities themselves.
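For a concrete "shadow", take C = Set and let A be a group G: then C(Y, |A|) is the set of all functions Y --> G, and the induced structure is the familiar pointwise one. A sketch of mine, with Y = {0, 1, 2} and G = Z/5Z written additively, functions modeled as 3-tuples:

```python
N = 5          # G = Z/5Z, written additively
Y_SIZE = 3     # Y = {0, 1, 2}; a function Y -> G is a 3-tuple

def shadow_mult(f, g):
    """Multiplication on C(Y, |A|) induced pointwise by the structure of A."""
    return tuple((a + b) % N for a, b in zip(f, g))

def shadow_inv(f):
    """The induced inverse operation."""
    return tuple((-a) % N for a in f)

SHADOW_E = (0,) * Y_SIZE  # the induced identity element
```

Each such shadow satisfies the group identities because A does, illustrating the correspondence between diagrammatic conditions on A and identities in its shadows.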
----------------------------------------------------------------------
Regarding the concept of an algebra object in a general category
(section 10.2), you ask whether one can similarly define a "poset
object", a "topological space object", etc..
I'll just address the case of a poset object. A difficulty in defining
such a structure in a general category is that a binary relation on a
set S is a subset of the product S\times S, and there is no
canonical choice for what a "subobject" of an object of a category
should be. One can make definitions based on various choices -- taking
subobjects to mean domains of monomorphisms, or equalizer objects --
and see whether any of these have nice properties. Or one could see
whether one can characterize the "structure" (in an unknown sense)
that an object R of a category C needs to have to determine poset
structures on all objects C(R,X) in a functorial way.
I just tried Googling
"partially ordered object" category
and the results all seemed to be about toposes or cartesian-closed
categories. I haven't studied these, but I know that they are
classes of categories that behave much more like Set than most
categories do. This suggests that no one has found a good way
to generalize the concept of partial ordering to objects of a
"typical" category.
----------------------------------------------------------------------
Concerning Definition 10.3.5, you write
> ... I am almost sure this representability is not equivalent to
> the representability of section 8.
It's a generalization thereof. Part (ii) of definition 10.3.5 says
in effect "the functor must have the property that, if you forget
the operations, it gives a representable functor in the sense of
chapter 8". If V = Set, which can indeed occur, since sets are
just algebras with no operations, then there are no operations to
forget, and part (ii) then says that in this case -- the case to which
the definition of chapter 8 applies -- the definition agrees with that
of that chapter.
(This is like the question "does the definition of multiplication
of complex numbers conflict with the definition of multiplication
of real numbers?" No, because when restricted to real numbers, it
gives the same operation.)
----------------------------------------------------------------------
You ask why the assumption that C is a variety of algebras is
needed for condition (iii) of Definition 10.3.5 to be equivalent
to the other two.
If C is an arbitrary category with appropriate direct products
as in the first sentence of the definition, then "elements of A"
and "relations" satisfied by such elements aren't meaningful.
One might, of course, ask whether some weaker assumptions can
be made which would make those concepts meaningful. One can
certainly do that. If the conditions are too weak, the concepts
might be meaningful but the equivalence could fail. For
instance, if C is the category of all finite groups, one
can still speak of elements and relations, but a given system
of elements and relations might not determine a finite group,
so such X and Y might not determine a representing object.
(E.g., one element and no relations give in the variety of all
groups the infinite cyclic group, but don't determine any object
of the category of finite groups.) There are, however, conditions
weaker than that C be a variety of algebras, but strong enough
to make the equivalence hold. But to develop these would require
that we introduce further concepts, and for simplicity, I have
restricted the main focus of the text to varieties of algebras.
----------------------------------------------------------------------
Concerning Definition 10.3.5, you write
> I am not sure why Rep(C,V) is a full subcategory.
To define a subcategory Y of a category X, one must specify its
objects and its morphisms. If one begins by specifying the objects,
and also says it is to be a "full subcategory", this means that for two
objects that belong to Y, the morphisms between them in Y are to be
all the morphisms they have between them in X. Since the objects
of V^C are functors, when I say in the last sentence of Definition
10.3.5 that Rep(C,V) "consists of" the representable functors, I am
specifying the objects of the subcategory. Saying it is a full
subcategory says that the morphisms between two such functors in
Rep(C,V) are defined to be all the morphisms they have between them
in V^C.
(Did you understand the definition of "full subcategory" when
you asked this question?)
----------------------------------------------------------------------
Regarding the diagram in the proof of Theorem 10.4.3, you ask whether
it is usually easy to obtain a concrete form for G(A).
There's no "usually"! It depends on the categories involved.
(After all, these left adjoints are examples of universal
constructions, and we saw in Chapter 4 that general universal
constructions in groups, such as the description of groups
presented by given generators and relations, range from
very easy to very hard.)
----------------------------------------------------------------------
You ask whether Freyd's result, Theorem 10.4.9, is mostly used
with C a variety, as in our initial example of SL(n).
Within algebra, certainly. There are other categories of algebras
with small colimits; for instance, those defined by "Horn sentences",
i.e., implications such as "x^2 = e => x = e" in groups. (If we
take the implications "x^n = e => x = e" for all positive integers
n, we get the category of torsion-free groups.) But varieties are
more often studied. Outside of algebra, I don't know.
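If it helps to see a Horn sentence checked mechanically, here is a
small Python sketch (my own illustration, not from the text), with
cyclic groups written additively, so that "x^2 = e" becomes "2x = 0":

```python
# Check the Horn sentence "x^2 = e => x = e" in the cyclic group Z/n.
# (Illustrative toy example of a quasi-identity.)

def satisfies_horn(n):
    """True if Z/n (written additively) satisfies: 2x = 0 implies x = 0."""
    return all(x == 0 for x in range(n) if (2 * x) % n == 0)

print(satisfies_horn(3))  # True: Z/3 has no element of order 2
print(satisfies_horn(4))  # False: x = 2 satisfies 2x = 0 but x != 0
```

A class of groups defined by such implications is closed under
subgroups and direct products, but, unlike a variety, need not be
closed under homomorphic images.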
> Question 1: In practice, how difficult is showing that a given
> functor is representable in the sense of Definition 10.3.4?
For C a variety of algebras, it's usually pretty clear when
the hypotheses apply. Showing that a functor is not representable
can be trickier; but since we have just seen that representability
as an algebra-valued functor is equivalent to representability of
the set-valued functor gotten by passing to underlying sets,
Proposition 8.10.4 tells us what we should check. Again, outside
algebra, I can't say.
----------------------------------------------------------------------
Concerning the beginning of section 10.6 you write:
> ... Can we think of the multiplication in Monoid to be derived
> from the comultiplication? In other words, can we think of
> comultiplication to come before multiplication?
It will be easier to talk about the SL(n) example, because there
the two varieties, CommRing^1 and Group, are different, so when
I talk about "the ring operations" and "the group operations", you
will know whether I mean the operations of the domain or the
codomain of the representable functor.
In the SL(n) construction, the varieties CommRing^1 and Group
are defined first -- the definitions of those varieties use only
operations, as in all the preceding chapters of the book; no
co-operations. Given those two varieties, one considers a functor
from the first to the second. This is a construction that takes
for input any commutative ring A, and produces for output a
group SL(n,A). The operations of the group that we construct
are defined using the operations of the ring, and the way that this
is done is encoded in the co-operation on the representing object.
Our investigation of representable functors Monoid --> Monoid is
similar, except that instead of being given a known construction
like SL(n), and finding a way of encoding it using co-operations,
we are investigating all possible functors Monoid --> Monoid that
can be encoded in this way, i.e., that are representable.
> ... Even with the example of SL(n), I am not grasping
> what is meant by representing multiplication. ...
Well, let's go back a few steps. Make sure that you understand
the following points:
--> If V: C --> Set is a representable functor, with
representing object R, then for any object A of C,
elements of V(A) correspond to homomorphisms R --> A.
--> In the above situation, R carries a universal element
of V -- an element of V(R) that is sent to each
element of each set V(A) by a unique morphism R --> A.
--> In the above situation, R \coprod R likewise carries a
universal ordered pair of elements of V.
We then see that:
--> If there is a binary operation which we can define in a
functorial way on the objects V(A) (A\in Ob(C)), then
by applying it to the universal ordered pair of elements of
V(R\coprod R), we get what can be considered a universal
instance of that operation. It will be an element of
V(R\coprod R), hence will correspond to a morphism
R --> R\coprod R. That is the co-operation. Using
the universal property of R\coprod R, it determines the
operation on all objects V(A).
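In case a concrete instance of the first point helps, here is a toy
Python encoding (my own, not notation from the text) of the simplest
case: C = Monoid, V = the underlying-set functor, R = the free
monoid on one generator x. Homomorphisms R --> M then correspond to
elements of M, and the universal element is x itself:

```python
# The free monoid on x has elements x^n (n >= 0), encoded below
# simply as the exponent n.  A homomorphism to a monoid (M, op, e)
# is determined by where it sends x.

def hom_from_free(a, op, e):
    """The homomorphism from the free monoid on x to (M, op, e)
    sending x to a; it sends x^n to a^n."""
    def h(n):
        result = e
        for _ in range(n):
            result = op(result, a)
        return result
    return h

# Example: M = strings under concatenation; take a = "ab".
h = hom_from_free("ab", lambda s, t: s + t, "")
print(h(3))  # "ababab" -- i.e., x^3 |-> a^3

# Every homomorphism is recovered from its value at the universal
# element x (the case n = 1):
print(h(1))  # "ab"
```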
- - -
The question of "which comes first" really depends on the situation
one chooses to look at. In the SL(n) situation, we first knew how
to multiply such matrices, and then translated this into a co-operation.
In the present study of functors Monoid --> Monoid, we don't know,
at the outset, what representable functors exist, so we are starting
with the properties that a representing monoid, a comultiplication,
and a co-neutral element must have if they are to determine such a
functor.
In any case, note that the operations of the representing object R
must be defined before we can speak of co-operations, since a
co-operation is a map to a coproduct, and the structure of a coproduct
of copies of R depends on the algebra structure of R.
- - -
Unless this clears the problem up completely, I suggest coming to
office hours to discuss it.
----------------------------------------------------------------------
Regarding Theorem 10.6.20 you ask,
> How do we know that multiple E-systems don't correspond to the same
> representable functor from monoids to monoids?
That's implied by Exercise 10.6:2. The unit of the adjunction
is the map taking each E-system X to PQ(X). If two nonisomorphic
E-systems X and X' had Q(X) isomorphic to Q(X'), then
PQ(X) would be isomorphic to PQ(X'); but by the exercise, they
are isomorphic to X and X' respectively, hence not to each
other -- a contradiction. Intuitively, the result says that one
can recover the structure of X from Q(X), namely by applying
the functor P.
(Theorem 10.6.20 is not very clearly stated; I have made a note
to rewrite it.)
----------------------------------------------------------------------
Regarding the last line of Theorem 10.6.20, you ask
> ... What is meant by the word "equivalence" ...
See Definition 7.9.5.
> ... and why isn't it italicized?
In an italic passage, de-italicization is used to show emphasis,
just as italicization is used in non-italic text. I'm not entirely
happy with that convention, since to my eyes, de-italicization doesn't
make a word stand out the way italicization does, and doesn't give
the same "feeling" of emphasis. But the convention is standard, and
I follow it.
I am emphasizing the word "equivalence" in this theorem because
it conveys the "punch" of the result: that E-systems give all the
information one could ask for about representable functors from
monoids to monoids: What the distinct structures are, and how they
can be mapped to each other.
----------------------------------------------------------------------
> Why is the natural correspondence between isomorphism classes
> as mentioned in thm 10.6.20 contravariant?
It goes back to the Yoneda Lemma, and the point discussed in
Remark 8.2.8: Because the hom bifunctor of a category C is
covariant in one variable and contravariant in the other, the
functor taking each object to the *covariant* hom functor that
it induces is *contravariant*. In the present chapter, this
comes up in the form: the functor taking a coalgebra object
to the covariant algebra-valued functor it represents is
contravariant; so for given C and V, the category of
co-V-algebras in C is equivalent to the *opposite* of
Rep(C,V). This was Corollary 10.3.6. Since the category of
E-systems is equivalent to the category of co-monoids in Monoid,
it is equivalent to the opposite of Rep(Monoid,Monoid).
----------------------------------------------------------------------
Regarding the first two diagrams-with-dots on the page following
Theorem 10.6.20, you write
> ... I don't see how it is decided that the E-system represented by
> the first boxes represents the identity functor and the second boxes
> represents the opposite monoid functor. Why can't the first one
> represent the opposite monoid functor and the second one the identity
> functor?
In the second sentence of the paragraph preceding these boxes, note
the word "respectively". Make sure you understand what it means,
and what consequences it has for the two coalgebras you ask about.
Then look at the functors those two coalgebras represent. If you
have trouble at some point in this path I've outlined, tell me where,
and I'll help you from there.
> I have a problem with seeing what difference first coalgebra having
> the comultiplication m(x)=x^{rho} x^{lambda} for all x, and the
> next one having the comultiplication m(x)=x^{lambda}x^{rho}
> for all x, make to the respective functors. ...
It's not "for all x"! It's for the one element x\in ||R|| that
has degree 2! The other elements x^n have higher degree, and
are mapped by the comultiplication to correspondingly more complicated
expressions.
> ... Do you mind pointing out where I should re-read to understand
> this?
I guess the best place to start is Definition 10.3.1. The
second paragraph of that definition describes the operations on
the functor represented by a coalgebra object. It begins by saying
that these are induced "under the dual of the construction of the
preceding section", but then gives an explicit description of that
construction. This Definition is written from the point of view of
going from the co-operation to the operation, but that should not be
a problem: Take the case where |R| is the free monoid on one
generator x, note how to identify the set-valued functor represented
by |R| with the underlying-set functor on monoids, form the coproduct
of two copies of |R|, call the generator of the first "x^\lambda"
and the generator of the second "x^\rho", then consider two
different co-operations |R| --> |R|^\lambda \coprod |R|^\rho, namely,
m_1 taking x to x^\lambda x^\rho,
and
m_2 taking x to x^\rho x^\lambda.
(And note, as I emphasized above, that these do not take every
element r of |R| to r^\lambda r^\rho or r^\rho r^\lambda.
You should see what they do to other elements.)
Then apply Definition 10.3.1 to find the binary operations on the
underlying-set functor Monoid -> Set induced by those two
co-operations. Hopefully, you will find that one of them
is the original operation of the monoids to which the functor
is applied, while the other is the opposite multiplication.
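If you want to see that evaluation step carried out mechanically,
here is a toy Python encoding (mine, not the text's notation): words
in |R|^\lambda \coprod |R|^\rho are lists over {'L','R'}, and a
homomorphism from the coproduct to a monoid M is given by a pair
(a, b) = (image of x^\lambda, image of x^\rho):

```python
# Evaluating the co-operations m_1 and m_2 at a pair of elements of
# a monoid M.  (Toy encoding; words are lists over {'L','R'}.)

def evaluate(word, a, b, op, e):
    """Apply the homomorphism sending x^lambda |-> a, x^rho |-> b."""
    result = e
    for letter in word:
        result = op(result, a if letter == 'L' else b)
    return result

m1 = ['L', 'R']   # m_1: x |-> x^lambda x^rho
m2 = ['R', 'L']   # m_2: x |-> x^rho x^lambda

# Take M = strings under concatenation, a noncommutative monoid,
# so the two induced binary operations can be told apart:
concat = lambda s, t: s + t
print(evaluate(m1, "a", "b", concat, ""))  # "ab": the original operation
print(evaluate(m2, "a", "b", concat, ""))  # "ba": the opposite operation
```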
If you get stuck, come to office hours and go through it with
me. (Or if the problem is one that can easily be described
in e-mail, you can e-mail it to me.)
Once you see these examples, you will, hopefully, understand
how, given an operation on a representable set-valued functor, one
can go the other way, and find the co-operation on the representing
algebra that induces it.
Let me know how this goes. If I can identify the roadblocks that
keep students from understanding this material, I can hope to get
it across better in the future!
----------------------------------------------------------------------
You ask about "a nice description" of the left adjoints of the
representable functors Monoid --> Monoid (section 10.6).
Well, they all have the nice formal description as functors
associating to each monoid M a monoid gotten by attaching
together a bunch of copies of the representing monoid R, indexed
by the elements of M, with relations determined by the relations
of M and the comultiplication of R. But what the result looks
like for particular comonoids R can be complicated.
For the simplest interesting case: The left adjoint of the functor
associating to a monoid its group of invertible elements is the functor
associating to a monoid M the result of adjoining an inverse to
every element.
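For a finite monoid given explicitly, the functor itself (as opposed
to its left adjoint) is easy to compute. A small Python example of
my own: the monoid of all maps {0,1} --> {0,1} under composition,
which has 4 elements, of which 2 are invertible:

```python
# Compute the group of invertible elements of the monoid of all maps
# {0,1} --> {0,1} under composition.  A map f is encoded as the pair
# (f(0), f(1)).
from itertools import product

elements = list(product((0, 1), repeat=2))
compose = lambda f, g: (f[g[0]], f[g[1]])   # (f o g)
identity = (0, 1)

units = [f for f in elements
         if any(compose(f, g) == identity and compose(g, f) == identity
                for g in elements)]
print(units)  # [(0, 1), (1, 0)] -- the two bijections
```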
The next-simplest interesting case is the left adjoint of the functor
associating to M the monoid of pairs (a,b) in M with ab=e.
This adjoint adjoins to M, for every a\in M, an element a'
such that aa'=e, in such a way that the map a |-> a' reverses
the order of multiplication. Note that if in M one has ab=e,
then in the new monoid, b will have both the left inverse a
and the right inverse b', so it becomes invertible, so its left
inverse a also becomes invertible. So the elements that were
1-sided invertible in M all become invertible in the new monoid;
hence the
submonoid generated by those previously-1-sided-invertible elements
is embedded in a group. Elements that were not previously
1-sided invertible need not become invertible, though they do
become 1-sided invertible.
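Incidentally, the set of pairs (a,b) with ab = e is a monoid under
what I take to be the intended multiplication, (a,b)(c,d) = (ac, db)
(note the reversal in the second coordinate): one has
(ac)(db) = a(cd)b = ab = e. A brute-force Python check of that
closure on the 4-element monoid of maps {0,1} --> {0,1} (my example;
note that in a *finite* monoid ab = e forces ba = e, so only
two-sided-inverse pairs appear):

```python
# Pairs (a, b) with a o b = identity in the monoid of maps
# {0,1} --> {0,1} under composition; f encoded as (f(0), f(1)).
from itertools import product

elements = list(product((0, 1), repeat=2))
compose = lambda f, g: (f[g[0]], f[g[1]])
identity = (0, 1)

pairs = [(a, b) for a in elements for b in elements
         if compose(a, b) == identity]

# Closure under the (assumed) multiplication (a,b)(c,d) = (ac, db):
for (a, b) in pairs:
    for (c, d) in pairs:
        ac, db = compose(a, c), compose(d, b)
        assert compose(ac, db) == identity
print(pairs)  # only the two-sided-inverse pairs, as expected here
```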
I haven't studied those functors systematically ... .
----------------------------------------------------------------------
Regarding the results of section 10.6, you ask,
> ... We just described representable functors from MONOID
> into itself. Does this shed any light on representable functors
> from RING^1 into itself? (I'm asking because rings are monoids
> with extra structure.)
The problem with that approach is that determining representable
functors from Monoid to Monoid comes down to saying "These are the
only ways one can obtain an associative operation on (appropriate
sorts of) tuples of elements of a monoid, using the monoid operation
alone"; but when we are looking at functors on Ring^1, we aren't
limited to using "the monoid structure alone". Generally, if we are
looking at functors out of a given variety W, then restrictions on
the functors we can get into one variety V will also give us
information about restrictions on functors from W to other varieties
V' that in some sense "have a V-structure and more", but won't give
restrictions on functors from varieties W' that have "W-structure
and more"; inversely, existence results for representable functors
V --> W will give existence results on such functors V' --> W when
V' has "a V-structure and more", but will not give existence results on
functors V --> W' where W' has "a W-structure and more".
A description of all representable functors Ring^1 --> Ring^1 is,
however, obtained in my book with my first PhD student, Adam Hausknecht,
reference [2].
----------------------------------------------------------------------
You ask about the meaning of "subobject" in the paragraph following
display (10.7.1).
Good point. I guess I was implicitly relying on the fact that in
most real-world mathematical contexts, an idempotent endomorphism
of an object is a retraction to a subobject. This isn't a formal
statement true or even meaningful in an arbitrary category; so that
comment should be considered a heuristic observation, suggesting what
we should look for. It is valid with "object" taken to mean "category",
so it leads us to the right conclusion in this case.
----------------------------------------------------------------------
You ask about a generalization of Exercise 10.7:1(v). I hope
you would want to include (iii) and (iv) along with (v), since they
all show the same pattern.
In very general form, the pattern is that when one has a retraction of
a mathematical object X to an object Y, meaning a map f: X --> Y
which has a right inverse g: Y --> X, i.e., such that fg is
the identity morphism of Y, though gf may not be the identity
of X -- see paragraph containing (7.7.3) -- then maps
from any object Z to Y correspond to maps h: Z --> X
such that h = gfh; and likewise maps Y --> Z correspond to
maps i: X --> Z such that i = igf. Namely, the map h: Z --> X
corresponds to fh: Z --> Y, and the map i: X --> Z corresponds
to ig: Y --> Z. You can verify that this gives bijections of sets
of maps in each case.
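Here is a brute-force verification of that correspondence on a small
example in Python (my own choice of X, Y, Z and of the retraction):

```python
# f: X --> Y a retraction with section g (fg = id_Y); then maps
# Z --> Y biject with maps h: Z --> X satisfying h = gfh,
# via h |-> fh.
from itertools import product

X, Y, Z = (0, 1, 2), (0, 1), (0, 1)
f = {0: 0, 1: 1, 2: 1}        # retraction X --> Y
g = {0: 0, 1: 1}              # section  Y --> X; f(g(y)) = y

maps_ZX = [dict(zip(Z, v)) for v in product(X, repeat=len(Z))]
maps_ZY = [dict(zip(Z, v)) for v in product(Y, repeat=len(Z))]

fixed = [h for h in maps_ZX
         if all(h[z] == g[f[h[z]]] for z in Z)]   # h = gfh

images = [{z: f[h[z]] for z in Z} for h in fixed] # h |-> fh
print(len(fixed), len(maps_ZY))                   # equal counts: 4 4
print(all(m in maps_ZY for m in images)
      and len(images) == len({tuple(sorted(m.items())) for m in images}))
```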
The situations occurring in the exercise are a little more complicated
in that the composites FU and UG are not quite the identity
functors of the categories in question, but isomorphic to the identity
functors. So one has to make a little adjustment (noted parenthetically
at the end of 10.7:1(iii), and then taken for granted in the remaining
parts).
----------------------------------------------------------------------
You ask whether our classification of representable functors
K-Mod --> L-Mod in section 10.8 requires that K and L have unity.
The classification could be carried out either with or without
that assumption. (In the latter case, of course, we would leave
out (10.8.6) and (10.8.11).) But the category of modules over an
object K of the category Ring is equivalent to the category
of (unital) modules over the object K^1 of Ring^1, where K^1
is the ring whose underlying additive group is the direct sum
of the additive groups of Z and of K, and whose multiplication
is defined in a way that uses the multiplication of K on pairs
of elements of K, and makes the 1 of Z the multiplicative
neutral element. Since rings of the form K^1 are far from all
rings (e.g., no field has that form), the categories we get by
considering nonunital base rings are more restricted than those we
get by studying unital modules over unital base rings. So it
seems best to study the unital case, and obtain results for
nonunital rings, whenever one needs them, as corollaries gotten
by applying the unital results over the rings K^1, L^1.
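The multiplication of K^1 referred to above is, I believe, given on
pairs by (m,a)(n,b) = (mn, mb + na + ab), which makes (1,0) the
multiplicative neutral element. A quick Python sketch with the
nonunital ring K = 2Z as test case:

```python
# Unitalization K^1 = Z (+) K, elements encoded as pairs (n, a)
# with n in Z, a in K.  Here K = 2Z (even integers), a nonunital
# ring, chosen just for a concrete check.

def mult(x, y):
    (m, a), (n, b) = x, y
    return (m * n, m * b + n * a + a * b)

one = (1, 0)
x, y, z = (2, 4), (-1, 6), (0, 2)     # sample elements of (2Z)^1
print(mult(one, x) == x == mult(x, one))          # unit law holds
print(mult(mult(x, y), z) == mult(x, mult(y, z))) # associativity holds
```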
----------------------------------------------------------------------
You ask whether the concept of tensor product was motivated by
the considerations of section 10.9.
I think that the tensor product construction originated in physics,
where it was realized that the concept of "tension" in a solid --
the expression for the forces acting to stretch and compress the
material -- could be expressed as a member of a vector space, and
that this space had more dimensions than the 3-dimensional space R^3
in which the material lived, but that it was closely connected to
R^3, since every rotation of an object in R^3 induced a corresponding
transformation on the expression for tension. It was finally
worked out that it was a space generated by the image of a bilinear
map R^3 x R^3 --> V, and various spaces with such multilinear maps
were called "spaces of tensors"; and eventually a universal space S
with a multilinear map V x W --> S was called "the tensor product
of V and W". (Originally, for vector spaces V and W; later
for more general modules and bimodules.)
This is just the impression I've picked up; I've never studied
the history of the subject. But I'm sure that the relation with
composition of representable functors was realized much much later.
----------------------------------------------------------------------
Regarding the constructions C^pt and C^aug (Definition 10.10.1),
you write
> For a given category, we need not have a way to turn any object into
> a pointed object, as there need not be morphisms from the terminal
> object to every object, but all objects can be made augmented, right?
Nope. First, if we have an object X of C that doesn't admit a
morphism from the terminal object, then the corresponding object of
C^op won't admit a morphism to the initial object (since a morphism
from it to the initial object in C^op is just a morphism from the
terminal object of C to it).
But we don't have to go to such cases to get examples. In the
category CommRing^1, no ring that contains a field can be
augmented. (I.e., it can't have a homomorphism to Z.)
----------------------------------------------------------------------
Regarding the statement following Lemma 10.10.2 that when k has
nontrivial automorphisms and/or idempotent elements, the automorphism
class group of the variety of k-algebras has a more complicated
structure, you note that you see how automorphisms can be used, but
not idempotents.
If k = k_1 x k_2, then every k-algebra R can be written
R_1 x R_2, where R_1 is a k_1-algebra and R_2 is a k_2-algebra.
Hence we can construct the functor R_1 x R_2 |-> R_1 x R_2^{op}.
These constructions alone give a group isomorphic to the additive
group of the Boolean ring of idempotents of k. Combining these
with automorphisms of k, which in general permute the idempotents,
and hence induce automorphisms of the above group, we get a
semidirect product of the two groups.
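Concretely, the idempotents of a commutative ring form a Boolean ring
under e /\ f = ef and e (+) f = e + f - 2ef, and the group meant
above is the additive group of that Boolean ring, in which every
nonzero element has order 2. A quick Python check, with k = Z/30 as
my example:

```python
# Idempotents of Z/30 and their Boolean-ring addition.
n = 30
idem = [e for e in range(n) if (e * e) % n == e]
print(idem)  # eight idempotents, since 30 has three prime factors

xor = lambda e, f: (e + f - 2 * e * f) % n
assert all(xor(e, f) in idem for e in idem for f in idem)  # closure
assert all(xor(e, e) == 0 for e in idem)   # every element has order 2
```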
----------------------------------------------------------------------
> Just to check my understanding: the results we state in 10.12
> about contravariant right adjunctions also work for contravariant left
> adjunctions, but we're only stating one version because contravariant
> left adjunctions are so rare compared to left adjunctions, right?
No. If you look at Definition 8.12.1, you will see that in the
case of contravariant right adjunctions, if C is a variety and
we insert in place of the dash in (8.12.2) the free object F_C(1)
on one generator in C, then the left-hand side gives the underlying
set of V(~). Combining with the right-hand side now shows that
the resulting contravariant set-valued functor is representable,
namely by U(F_C(1)). Hence the contravariant C-valued functor V is
representable by a C-algebra structure on the object U(F_C(1)) of D.
But if you turn to the contravariant left adjunction case, shown
by (8.12.3), if C or D is a variety, there is in general no
way of rendering either side of that formula as a description of
the underlying set of U or V as a representable functor. (If
C or D is the opposite of a variety of algebras, one can get such
descriptions; but this just translates us to the covariant-adjunction
case.)
> Assuming an affirmative answer to the first question: Why are
> contravariant left adjunctions so rare?
I'll give you a reprint of my paper.
----------------------------------------------------------------------
> Are there any examples of functors T: C^\op \to C such that T and
> TT have left and right adjoints but TTT does not?
I don't know.
One conceivable way to approach the question would be to have
C be a direct product of, say, 4 categories C_0 x C_1 x C_2 x C_3,
and have T built out of functors C_i --> C_{i+1} (i=0,1,2),
such that some sort of bad behavior is reached only when we
carry C_0 into C_3. I don't have anything detailed in mind, but
you might be able to go somewhere with the idea.
----------------------------------------------------------------------
You ask about "the intuition for deriving representable functors
$V^op --> W$ from $V \bigcirc W$" (section 10.13).
The idea is that given an object $R$ of $V \bigcirc W$, it has both
a $V$ structure and a $W$ structure. Because of the former, one
can associate to every object $A$ of $V$ the set of homomorphisms
$V(A,R)$, and because of the latter, one can apply any n-ary
operation of $W$ pointwise to these V-homomorphisms. Finally, the
"commutativity" relations have the consequence that the result of
applying a W-operation pointwise to a tuple of V-homomorphisms is
again a V-homomorphism, making our set of V-homomorphisms a W-object.
As noted in class, duality of vector spaces is an example (with V=W).
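A tiny check of that last step in Python (my example), for
V = W = Ab: the pointwise sum of two group homomorphisms A --> R
is again a homomorphism, precisely because addition in R is
commutative. Here A = R = Z/4, whose endomorphisms are the maps
x |-> kx:

```python
# Pointwise sums of homomorphisms Z/4 --> Z/4 are homomorphisms.
n = 4

def is_hom(h):
    return all(h[(x + y) % n] == (h[x] + h[y]) % n
               for x in range(n) for y in range(n))

homs = [[k * x % n for x in range(n)] for k in range(n)]
assert all(is_hom(h) for h in homs)

for h1 in homs:
    for h2 in homs:
        assert is_hom([(h1[x] + h2[x]) % n for x in range(n)])
print("pointwise sums of homs Z/4 --> Z/4 are homs")
```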
----------------------------------------------------------------------