ANSWERS TO QUESTIONS ASKED BY STUDENTS in various classes about points in my handout "Some Basic Symbols and Concepts of Set Theory, Logic, and Mathematical Language"
----------------------------------------------------------------------
You ask about the choice of whether to include 0 among the natural numbers (p.1).

For mathematical purposes, it is most natural to include 0. For instance, the natural numbers are what one uses to count elements in finite sets, and the empty set is a finite set. The only argument I can see for not including it is that zero is a more "sophisticated" concept than 1, 2, 3, etc.; e.g., European languages had no word for "zero" till the late Middle Ages. But I think that was really a result of people's not feeling a need to name the obvious. (One could argue similarly that "1" is not a natural number, because numbers are used to count collections of things, and intuitively a single thing is not a collection. And on the basis of certain languages, one could push this even further and argue that 2 is not a natural number!)

I think most mathematicians today consider the set of natural numbers to include 0. Unfortunately, many textbook-writers use definitions from the textbooks they used when they were in school, so that old terminology remains in textbooks long after it has been rejected by practicing mathematicians.

(I call the tendency not to treat the number zero and the empty set like other numbers and sets in cases where they really deserve to be treated alike "nulliphobia".)
----------------------------------------------------------------------
You ask why some authors, such as Rudin, use the horseshoe symbol for "subset" rather than "proper subset".

The idea is that "subset" is a more primitive concept than "proper subset", so that it ought to be represented by the more primitive symbol. I have read that there was at one time a movement to similarly use "<" to mean "less than or equal to".
This and the use of the horseshoe for "subset" would have made a consistent system; with "proper subset" and "strictly less than" being expressed, when needed, by adding a "not-equal" sign under the symbol in question. But the interpretation of "<" as "less than" was too deeply entrenched to be changed, and as a result, the corresponding usage for subsets has prevailed; though the other usage has not entirely died out.
----------------------------------------------------------------------
You ask why mathematicians haven't agreed on a uniform notation (e.g., for symbols for subsets and proper subsets, p.1), the way chemists have.

I am torn between trying to analyze this objectively, and trying to defend mathematicians. And also torn because I still have a large number of Wednesday's "questions of the day" left to answer, so I can't let myself take long on this.

I certainly agree that it would be convenient if mathematicians agreed on a common notation for "subset". On the other hand, a very basic attitude of mathematicians is that there is no "correct" name or symbol for something; rather, correctness lies in the deductions you make after you say what your words or symbols will mean. There are worse and better choices of definitions and notation, but these are worse or better according to their fruits, i.e., in that they make it easier or harder to state or prove elegant and powerful results. So while the tendency to criticize one another's notation and to seek uniformity is not absent in mathematicians, it is probably less strong among mathematicians than among people in other fields. A mathematician is willing to read another mathematician's paper, adjusting to his or her different terminology, if the mathematical content is interesting.
Another reason is that, although it is true that some concepts such as that of "subset" are common to all areas of mathematics, in other respects the various areas of mathematics deal with enormously different topics, which it is natural to denote using different notations. So mathematics is not as coherent a conceptual community as is chemistry, hence it has not had as much of a tendency to develop standardized conventions.
----------------------------------------------------------------------
You ask about the meanings of "necessary" and "sufficient" in Exercise 1, p.3.

If P and Q are statements, then to say P is a "necessary condition" for Q to hold is to say Q implies P -- the idea behind the words is "Q can't happen unless P happens". Conversely, to say P is a "sufficient condition" for Q to hold is to say that P implies Q -- the idea here being that to be sure Q is true, it is enough (sufficient) to know that P is true.

These words are also used in everyday life in these ways.
----------------------------------------------------------------------
You ask about the statement on p.2, that index-sets don't have to be sets of integers.

As an example where we would want to index a family of sets by elements other than integers, suppose we were looking at all disks of radius 1 in the plane. Each such disk is determined by its center-point, which can be any point (x,y) of the plane. So the set of such disks would be \{D_{(x,y)} | (x,y)\in R x R\}. Or, letting I = R x R, \{D_i | i\in I\}.
----------------------------------------------------------------------
You ask about the definition of a function as a subset of a direct product (p.3).

Well, in setting up set theory, the question is how many "primitive" concepts one needs, and to what extent, on the contrary, one can derive some concepts from others. Naively we might think of "function" as a primitive concept.
But when we analyze it, we see that a function X --> Y is determined by "listing" each element of X, and the element of Y that is associated to it. So this "list" of pairs, (x, f(x)), determines what we think of as the function. Hence rather than making "function" a primitive concept in their development of set theory, set theorists _define_ a function to be a set of ordered pairs with the appropriate property.

However, having satisfied ourselves that we can define "function" in this way, we mostly continue to use it in the way to which we are accustomed. In particular, we won't make use of the description of a function as a set of ordered pairs in this course.
----------------------------------------------------------------------
You ask whether in the graph of a function f: X --> Y (p.3), there has to be an ordered pair (x,y) for every x in the domain X of the function.

If you are asking whether a function has to give a value for every element of its domain, the answer is yes. (Something that assigns values only to a subset of X can be called a "partial function" on X. These come up a lot, e.g., when one considers 1/x on the real line. But generally, we don't use the language of partial functions, but simply say that this is a function whose domain is a subset of R, rather than R itself.)

If, on the other hand, you understood that f: X --> Y gives a value for all x\in X, but you were unsure whether this carried over to the concept of "graph" as defined in the notes, then you need to clarify your understanding of set notation: I described the graph of f as {(x,f(x)) | x\in X}; this means the set of _all_ pairs (x,f(x)) that one gets by taking x\in X.

(In either case, the answer to the question is "yes".)
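For readers who find it helpful to see the set-of-ordered-pairs description on a small finite example, here is a minimal sketch in Python. It is not part of the handout: the names graph_of, is_function_graph, X, Y, G, partial and multi are chosen here purely for illustration, and the sketch only makes sense for finite sets. It builds the graph {(x,f(x)) | x\in X} and checks the "appropriate property", namely that every x\in X occurs as a first coordinate exactly once.

```python
def graph_of(f, X):
    """Return the graph {(x, f(x)) | x in X} of f on the finite set X."""
    return {(x, f(x)) for x in X}


def is_function_graph(G, X, Y):
    """Check that the set of pairs G is the graph of a function X --> Y.

    That is: every pair lies in X x Y, and each x in X occurs as a
    first coordinate exactly once.
    """
    if any(x not in X or y not in Y for (x, y) in G):
        return False
    firsts = [x for (x, _) in G]
    # No first coordinate is repeated, and none is missing.
    return len(firsts) == len(set(firsts)) and set(firsts) == set(X)


# Illustrative finite example (an assumption of this sketch): squaring.
X = {-2, -1, 0, 1, 2}
Y = {0, 1, 4}
G = graph_of(lambda x: x * x, X)

# Dropping the pair for x = 0 leaves only a "partial function" on X.
partial = {(x, y) for (x, y) in G if x != 0}

# Adding a second value for x = 1 makes the relation multi-valued.
multi = G | {(1, 0)}

print(is_function_graph(G, X, Y))
print(is_function_graph(partial, X, Y))
print(is_function_graph(multi, X, Y))
```

In this sketch the first check succeeds, while the other two fail: "partial" has no pair for x = 0, and "multi" has two pairs with first coordinate 1. This matches the point made above, that the graph must contain a pair (x,f(x)) for every x\in X.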
----------------------------------------------------------------------
Regarding the statement on p.4 that P ==> Q "is considered to be true in all cases except those where P is true but Q is false", you ask "by definition of P implying Q, how could such cases exist?"

Well, we're talking about "P ==> Q" as a statement -- a statement which may be true or false. When it is _true_, then the case mentioned can't occur. But we do have to think about the possibility that P is true and Q false in discussing whether _or_not_ P ==> Q is true. As the handout says, it is precisely when that happens that we say "P ==> Q" is false.

Things get confusing when we talk about how we talk, but I hope this helps.
----------------------------------------------------------------------
Regarding the warning at the top of p.6, you ask "If X and Y are sets, why can we not make the statement that X implies Y?"

"Implies" is a relation between statements; a relationship between the truth of one statement and the truth of another. One doesn't have a concept of a set being "true" or "false", so I don't see what meaning one would give to one set "implying" another.

I wish you had written what you thought "X implies Y" meant for sets X and Y, so that I could have responded better to your question. As I say on the class handout, it helps to include these details, since it isn't like office hours where I can question you about what you had in mind.
----------------------------------------------------------------------
You asked how to read and interpret the formula in Exercise 4(a): \not (\exists x) (\not P(x)).

I'm not sure what you mean by "read". If it were written on a blackboard and I was talking with someone about it, I might say "there does not exist x such that not P-of-x". Someone else who wanted to stay closer to the symbols might say "not there-exists x not P-of-x".
As to whether the first "not" applies to "the first statement only" or "the whole of both statements" -- there isn't any "first statement". The symbol "\exists x" is not a statement; it means "there exists an x such that ...", and its effect is to modify the meaning of a following statement about x. So that first "not" can only apply to the whole statement one gets by combining "\exists x" with "\not P(x)". After you apply the "not", the meaning of the statement is that there does not exist any x for which P(x) isn't true.
----------------------------------------------------------------------
You ask why I say on p.8 that "one cannot ask whether the statement (\exists x\in Z) x^5 = x is `true for x=3', but only whether for x=3 the equation x^5=x holds".

If one asks whether "(\exists x\in Z) x^5 = x is true for x=3", one is giving contradictory instructions: One instruction (coming from "there exists") is to check the equation for _all_ x\in Z and say "yes" if it holds for _at_least_one_ x. The other instruction is to check the equation for x=3, and say "yes" if it holds for that value. If you have those two contradictory instructions, which should you follow?

On the other hand, if one asks whether for x=3 the equation x^5=x holds, one is just instructed to check it for x=3, and there is no contradiction.
----------------------------------------------------------------------
You ask about the use of the word "arbitrary" in proofs where one says something like "Since a is an arbitrary vector in A, ...".

Good question. Here "arbitrary" does not have a mathematical meaning; the phrase "a is an arbitrary vector ..." means "We did not assume any restrictions on the vector a". (E.g., we did not assume it was nonzero, or had integer coordinates, or whatever restrictions might have been possible in the context.) The consequence is that whatever we have proved about "a" in these circumstances is true of all vectors.
(If we had assumed a had integer coordinates, then we could only say that those conclusions were true of vectors with integer coordinates, etc.)

Another place where "arbitrary" comes up is in stating hypotheses. There, it usually means that some assumption that is lurking in the context is not to be assumed. So if one says "Let X be a finite subset of A, and Y an arbitrary subset of A", the word "arbitrary" is there to emphasize that the assumption "finite" is not being made for Y. Sometimes it's not so clear what assumptions the author means to have you avoid; one can simply interpret it as saying "If you were thinking that some unstated restriction applied -- don't!"
----------------------------------------------------------------------