ANSWERS TO QUESTIONS ASKED BY STUDENTS in various classes about points
in my handout "Some Basic Symbols and Concepts of Set Theory, Logic,
and Mathematical Language"
----------------------------------------------------------------------
You ask about the choice of whether to include 0 among the natural
numbers (p.1).
For mathematical purposes, it is most natural to include 0. For
instance, the natural numbers are what one uses to count elements in
finite sets, and the empty set is a finite set.
The only argument I can see for not including it is that zero is
a more "sophisticated" concept than 1, 2, 3, etc.; e.g., European
languages had no word for "zero" till the late Middle Ages. But I think
that was really a result of people's not feeling a need to name the
obvious. (One could argue similarly that "1" is not a natural number,
because numbers are used to count collections of things, and intuitively
a single thing is not a collection. And on the basis of certain
languages, one could push this even further and argue that 2 is not a
natural number!)
I think most mathematicians today consider the set of natural numbers
to include 0. Unfortunately, many textbook-writers use definitions
from the textbooks they used when they were in school, so that old
terminology remains in textbooks long after it has been rejected by
practicing mathematicians. (I call the tendency not to treat the number
zero and the empty set like other numbers and sets in cases where they
really deserve to be treated alike "nulliphobia".)
----------------------------------------------------------------------
You ask why some authors, such as Rudin, use the horseshoe symbol for
"subset" rather than "proper subset".
The idea is that "subset" is a more primitive concept than "proper
subset", so that it ought to be represented by the more primitive
symbol. I have read that there was at one time a movement to similarly
use "<" to mean "less than or equal to". This and the use of
the horseshoe for "subset" would have made a consistent system,
with "proper subset" and "strictly less than" being expressed, when
needed, by adding a "not-equal" sign under the symbol in question. But
the interpretation of "<" as "less than" was too deeply entrenched to
be changed, and as a result, the corresponding usage for subsets (the
horseshoe meaning "proper subset") has prevailed, though the other
usage has not entirely died out.
----------------------------------------------------------------------
You ask why mathematicians haven't agreed on a uniform notation (e.g.,
for symbols for subsets and proper subsets, p.1), the way chemists have.
I am torn between trying to analyze this objectively, and trying
to defend mathematicians. And also torn because I still have a
large number of Wednesday's "questions of the day" left to answer,
so I can't let myself take long on this.
I certainly agree that it would be convenient if mathematicians
agreed on a common notation for "subset".
On the other hand, a very basic attitude of mathematicians is that
there is no "correct" name or symbol for something; rather, correctness
lies in the deductions you make after you say what your words or
symbols will mean. There are worse and better choices of definitions
and notation, but these are worse or better according to their fruits,
i.e., in that they make it easier or harder to state or prove elegant
and powerful results.
So while the tendency to criticize one another's notation and
to seek uniformity is not absent in mathematicians, it is probably
less strong among mathematicians than among people in other fields.
A mathematician is willing to read another mathematician's paper,
adjusting to his or her different terminology, if the mathematical
content is interesting.
Another reason is that, although it is true that some concepts such
as that of "subset" are common to all areas of mathematics, in other
respects the various areas of mathematics deal with enormously different
topics, which it is natural to denote using different notations. So
mathematics is not as coherent a conceptual community as is chemistry,
hence it has not had as much of a tendency to develop standardized
conventions.
----------------------------------------------------------------------
You ask about the meanings of "necessary" and "sufficient" in
Exercise 1, p.3.
If P and Q are statements, then to say P is a "necessary
condition" for Q to hold is to say Q implies P -- the idea
behind the words is "Q can't happen unless P happens". Conversely,
to say P is a "sufficient condition" for Q to hold is to say that
P implies Q -- the idea here being that to be sure Q is true,
it is enough (sufficient) to know that P is true. These words are
also used in everyday life in these ways.
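As a small illustration of these two words, here is a Python sketch.
The statements P(n) ("n is divisible by 4") and Q(n) ("n is even") are
examples assumed for this sketch, not taken from the handout, and a
check over a finite sample of integers only illustrates the
implications; it does not prove them.

```python
# Illustrative statements about an integer n (assumed for this sketch):
#   P(n): n is divisible by 4
#   Q(n): n is even
def P(n):
    return n % 4 == 0

def Q(n):
    return n % 2 == 0

def implies(a, b):
    # "a implies b" is false only when a is true and b is false.
    return (not a) or b

sample = range(-20, 21)

# P is sufficient for Q: P(n) implies Q(n) for every n checked.
p_sufficient_for_q = all(implies(P(n), Q(n)) for n in sample)

# Q is necessary for P: this is the very same implication, P(n) ==> Q(n).
q_necessary_for_p = all(implies(P(n), Q(n)) for n in sample)

# Q is not sufficient for P: n = 2 is even but not divisible by 4.
q_sufficient_for_p = all(implies(Q(n), P(n)) for n in sample)

print(p_sufficient_for_q, q_necessary_for_p, q_sufficient_for_p)
# prints: True True False
```

Note that "P is sufficient for Q" and "Q is necessary for P" are
computed by the same expression: they are two ways of saying P ==> Q.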
----------------------------------------------------------------------
You ask about the statement on p.2, that index-sets don't have to be
sets of integers.
As an example where we would want to index a family of sets by elements
other than integers, suppose we were looking at all disks of radius 1
in the plane. Each such disk is determined by its center-point, which
can be any point (x,y) of the plane. So the set of such disks would
be \{D_{(x,y)} | (x,y)\in R x R\}. Or, letting I = R x R,
\{D_i | i\in I\}.
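A rough Python sketch of a family indexed by points rather than
integers follows. The name D, the choice of closed disks, and the
finite sample of centers are assumptions made for illustration; a
program can only list finitely many indices, whereas I = R x R is
uncountable.

```python
import math

def D(center):
    """Closed disk of radius 1 about `center`, given as a membership test."""
    cx, cy = center
    return lambda point: math.hypot(point[0] - cx, point[1] - cy) <= 1.0

# A finite sample of indices i = (x, y) drawn from I = R x R.
I_sample = [(0.0, 0.0), (0.5, -2.0), (math.pi, math.sqrt(2))]

# The family {D_i | i in I_sample}: each disk is looked up by its index,
# which is a point of the plane rather than an integer.
family = {i: D(i) for i in I_sample}

print(family[(0.0, 0.0)]((0.5, 0.5)))   # True: the point lies in D_(0,0)
print(family[(0.0, 0.0)]((2.0, 0.0)))   # False: the point lies outside it
```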
----------------------------------------------------------------------
You ask about the definition of a function as a subset of a direct
product (p.3).
Well, in setting up set theory, the question is how many "primitive"
concepts one needs, and to what extent, on the contrary, one can
derive some concepts from others. Naively we might think of "function"
as a primitive concept. But when we analyze it, we see that a function
X --> Y is determined by "listing" each element of X, and the
element of Y that is associated to it. So this "list" of pairs,
(x, f(x)), determines what we think of as the function. Hence
rather than making "function" a primitive concept in their development
of set theory, set theorists _define_ a function to be a set of ordered
pairs with the appropriate property.
However, having satisfied ourselves that we can define "function" in
this way, we mostly continue to use it in the way to which we are
accustomed. In particular, we won't make use of the description of a
function as a set of ordered pairs in this course.
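For readers who want to see "the appropriate property" spelled out,
here is a minimal Python sketch for finite sets. The helper name
is_function and the sample sets X, Y, f, g, h are hypothetical,
introduced only for this illustration.

```python
def is_function(pairs, X, Y):
    """Return True if `pairs` is the graph of a function from X to Y."""
    # Every pair must lie in the product X x Y.
    if any(x not in X or y not in Y for (x, y) in pairs):
        return False
    # Each x in X must occur as a first coordinate exactly once.
    firsts = [x for (x, _) in pairs]
    return all(firsts.count(x) == 1 for x in X)

X = {1, 2, 3}
Y = {"a", "b"}

f = {(1, "a"), (2, "a"), (3, "b")}            # a function X --> Y
g = {(1, "a"), (1, "b"), (2, "a"), (3, "b")}  # 1 is given two values
h = {(1, "a"), (2, "b")}                      # 3 is given no value

print(is_function(f, X, Y), is_function(g, X, Y), is_function(h, X, Y))
# prints: True False False
```

The set h fails because a function must give a value for every
element of its domain; g fails because that value must be unique.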
----------------------------------------------------------------------
You ask whether in the graph of a function f: X --> Y (p.3), there
has to be an ordered pair (x,y) for every x in the domain X of
the function.
If you are asking whether a function has to give a value for every
element of its domain, the answer is yes. (Something that assigns
values only to a subset of X can be called a "partial function"
on X. These come up a lot, e.g., when one considers 1/x on the
real line. But generally, we don't use the language of partial
functions, but simply say that this is a function whose domain is
a subset of R, rather than R itself.)
If, on the other hand, you understood that f: X --> Y gives a
value for all x\in X, but you were unsure whether this carried over
to the concept of "graph" as defined in the notes, then you need
to clarify your understanding of set notation: I described the graph
of f as \{(x,f(x)) | x\in X\}; this means the set of _all_ pairs
(x,f(x)) that one gets by taking x\in X.
(In either case, the answer to the question is "yes".)
----------------------------------------------------------------------
Regarding the statement on p.4 that P ==> Q "is considered to be true
in all cases except those where P is true but Q is false", you ask
"by definition of P implying Q, how could such cases exist?"
Well, we're talking about "P ==> Q" as a statement -- a statement which
may be true or false. When it is _true_, then the case mentioned can't
occur. But we do have to think about the possibility that P is true
and Q false in discussing whether _or_not_ P ==> Q is true. As the
handout says, it is precisely when that happens that we say "P ==> Q"
is false.
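The four cases can be tabulated mechanically. In the Python sketch
below, the helper implies is an assumed name that encodes the
convention just described: "P ==> Q" is false exactly when P is true
and Q is false.

```python
from itertools import product

def implies(p, q):
    # False only in the case "p true, q false"; true in all other cases.
    return (not p) or q

# Rows are (P, Q, value of "P ==> Q") for all four combinations.
table = [(p, q, implies(p, q)) for p, q in product([True, False], repeat=2)]

for p, q, value in table:
    print(p, q, value)
# prints:
# True True True
# True False False
# False True True
# False False True
```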
Things get confusing when we talk about how we talk, but I hope
this helps.
----------------------------------------------------------------------
Regarding the warning at the top of p.6, you ask "If X and Y are sets,
why can we not make the statement that X implies Y?"
"Implies" is a relation between statements; a relationship between
the truth of one statement and the truth of another. One doesn't
have a concept of a set being "true" or "false", so I don't see what
meaning one would give to one set "implying" another. I wish you had
written what you thought "X implies Y" meant for sets X and Y, so
that I could have responded better to your question. As I say on
the class handout, it helps to include these details, since it isn't
like office hours where I can question you about what you had in mind.
----------------------------------------------------------------------
You asked how to read and interpret the formula in Exercise 4(a):
\not (\exists x) (\not P(x)).
I'm not sure what you mean by "read". If it were written on a
blackboard and I were talking with someone about it, I might say
"there does not exist x such that not P-of-x". Someone else who
wanted to stay closer to the symbols might say "not there-exists x
not P-of-x".
As to whether the first "not" applies to "the first statement only"
or "the whole of both statements" -- there isn't any "first statement".
The symbol "\exists x" is not a statement; it means "there exists
an x such that ...", and its effect is to modify the meaning of a
following statement about x. So that first "not" can only apply to the
whole statement one gets by combining "\exists x" with "\not P(x)".
After you apply the "not", the meaning of the statement is that there
does not exist any x for which P(x) isn't true.
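Over a finite range of values this reading can be checked directly.
The Python sketch below uses two hypothetical predicates, P and R, and
a finite stand-in for the range of x; these are assumptions for
illustration, not part of the exercise.

```python
def P(x):
    # Hypothetical predicate, true for every integer: x*x is non-negative.
    return x * x >= 0

def R(x):
    # Hypothetical predicate, false for some integers: x is positive.
    return x > 0

domain = range(-5, 6)  # finite stand-in for the range of the variable x

def not_exists_not(pred):
    # The outer "not" applies to the whole quantified statement
    # "(\exists x) (\not pred(x))", not to the quantifier alone.
    return not any(not pred(x) for x in domain)

def for_all(pred):
    # "pred(x) is true for every x in the domain."
    return all(pred(x) for x in domain)

print(not_exists_not(P), for_all(P))  # prints: True True
print(not_exists_not(R), for_all(R))  # prints: False False
```

In both cases the two computations agree: saying that there is no x
for which the predicate fails is the same as saying it holds for all x.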
----------------------------------------------------------------------
You ask why I say on p.8 that "one cannot ask whether the statement
(\exists x\in Z) x^5 = x is `true for x=3', but only whether for
x=3 the equation x^5=x holds".
If one asks whether "(\exists x\in Z) x^5 = x is true for x=3", one is
giving contradictory instructions: One instruction (coming from "there
exists"), is to check the equation for _all_ x\in Z and say "yes" if
it holds for _at_least_one_ x. The other instruction is to check the
equation for x=3, and say "yes" if it holds for that value. If you
have those two contradictory instructions, which should you follow?
On the other hand, if one asks whether for x=3 the equation x^5=x holds,
one is just instructed to check it for x=3, and there is no
contradiction.
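The same distinction shows up in code. In the Python sketch below
(using a finite range as an assumed stand-in for Z), the x inside the
quantified expression is a local variable of that expression, so there
is nothing outside it that one could "set to 3".

```python
domain = range(-10, 11)  # finite stand-in for Z

# "(\exists x\in Z) x^5 = x": x is bound by the quantifier; the result
# is a single truth value that does not depend on any outside x.
exists_solution = any(x**5 == x for x in domain)

# "For x=3 the equation x^5=x holds": here x is given one specific
# value, and only that value is checked.
holds_at_3 = (3**5 == 3)

print(exists_solution, holds_at_3)
# prints: True False
```

The first value is True because of x = -1, 0 and 1; the second is
False because 3^5 = 243.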
----------------------------------------------------------------------
You ask about the use of the word "arbitrary" in proofs where one
says something like "Since a is an arbitrary vector in A, ...".
Good question. Here "arbitrary" does not have a mathematical meaning;
the phrase "a is an arbitrary vector ..." means "We did not assume
any restrictions on the vector a". (E.g., we did not assume it was
nonzero, or had integer coordinates, or whatever restrictions might
have been possible in the context.) The consequence is that whatever
we have proved about "a" in these circumstances is true of all vectors
in A. (If we had assumed a had integer coordinates, then we could
only say that those conclusions were true of vectors with integer
coordinates, etc.)
Another place where "arbitrary" comes up is in stating hypotheses.
There, it usually means that some restriction that is lurking in the
context is not to be assumed. So if one says "Let X be a finite
subset of A, and Y an arbitrary subset of A", the word "arbitrary"
is there to emphasize that finiteness is not being assumed
for Y. Sometimes it's not so clear what assumptions the author means
to have you avoid; one can simply interpret it as saying "If you were
thinking that some unstated restriction applied -- don't!"
----------------------------------------------------------------------