----------------------------------------------------------------------
You ask about the complex log being many-valued (p.4). This is in
the same sense that the "argument" defined on p.17 is many-valued.
Of course, by definition, a function has to be single-valued. But
nevertheless, the question "for each z\in C, what complex numbers
w satisfy e^w = z ?" gives infinitely many solutions; so when we
come to this topic, we will have to figure out how to introduce a function
or other mathematical entity that will give us solutions to that
equation, yet will be well-defined. We will discover several ways to
do this: There will be a "principal value" for the logarithm as there
is for the argument function; later we will introduce a way of modifying
the domain-set of the logarithm so that we get a function which gives
all values of the logarithm.
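If you like to experiment, here is a small Python sketch (mine, not the
book's) of the many-valuedness: every w_k = ln|z| + i(Arg z + 2 pi k)
solves e^w = z.

```python
import cmath
import math

# For a fixed nonzero z, every w_k = ln|z| + i(Arg z + 2*pi*k)
# satisfies e^w = z, so "log z" has infinitely many candidate values.
z = 1 + 1j
principal = cmath.log(z)              # the principal value: ln|z| + i Arg(z)
for k in range(-3, 4):
    w_k = principal + 2j * math.pi * k
    assert abs(cmath.exp(w_k) - z) < 1e-12   # each branch solves e^w = z
```

The "principal value" we will define later simply singles out the k = 0
solution.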
----------------------------------------------------------------------
You ask, regarding the 300-year-long worry about the meaning of
complex numbers (p.7):
> ... Isn't a real number with a repeating
> decimal just as invented as the number i?
From a philosophical point of view, I suppose so. But "approximate real
numbers" arise in the everyday process of measuring lengths, weights,
etc.; and the system of "pure real numbers" can be thought of as a
model of this real-life concept. I believe that it wasn't until the
late 19th century that people thought about _constructing_ the system
of real numbers; until then they just took it for granted, as the
Greeks took "the plane" for granted.
The troublesome thing about the square root of -1 was that it didn't
correspond even approximately to anything in the real world. It is
true that the square root of -1 can be represented by a certain point
of the plane, which is "concrete"; but nothing about the plane shows
that this point has square equal to -1, which is its essential
problematic feature.
----------------------------------------------------------------------
You ask, in connection with the definition of C on p.10, about spaces
such as C x C, and how they can be represented by tuples of numbers.
A point p of C x C will be determined by two complex numbers, and
each of these is determined by two real numbers, so altogether p
can be thought of as determined by a quadruple (or as mathematicians
more often say, a 4-tuple) of real numbers.
On the other hand, for many purposes, the description of C x C as
consisting of pairs of complex numbers is more useful.
This ambiguity leads to a certain awkwardness of language;
mathematicians can't simply refer to C x C as having a certain
dimension, they must use one of the explicit phrases "real dimension 4"
or "complex dimension 2" to describe the same object. (Of course, in
a given piece of mathematical writing, if these dimensions will come
up repeatedly, then the author will set a convention, "In this work,
`dimension' will always mean real dimension", or if he or she prefers,
"...will always mean complex dimension". Then he or she can thereafter
say "dimension 4" or "dimension 2" without ambiguity.)
But that won't be a problem in this course; we will generally be
studying open sets in C, which all have real dimension 2, complex
dimension 1; so we won't have to make distinctions of dimension.
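To see that the two descriptions of C x C carry the same information,
here is a toy Python illustration (the function names are mine):

```python
# A pair of complex numbers <-> a 4-tuple of reals: two descriptions of
# the same point of C x C (illustrative only).
def pair_to_quadruple(z, w):
    return (z.real, z.imag, w.real, w.imag)

def quadruple_to_pair(x1, y1, x2, y2):
    return (complex(x1, y1), complex(x2, y2))

p = (2 + 3j, -1 + 0.5j)
q = pair_to_quadruple(*p)
assert quadruple_to_pair(*q) == p   # the two descriptions carry the same data
```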
----------------------------------------------------------------------
You ask, in connection with definition of complex numbers on pp.11-12,
whether the results of complex analysis would have been the same if one
had started with a square root of -2 instead of a square root of -1.
Yes. The sets of expressions of the form x + y sqrt(-1) and
p + q sqrt(-2) (where x, y, p, q all range over the real numbers)
form isomorphic fields; an isomorphism is given by x + y sqrt(-1) |->
p + q sqrt(-2) where p=x, q = y/sqrt 2. They also have the
same topological structure, which essentially means that the above
isomorphism and its inverse map are both continuous as functions of
the coordinates. Together, these would lead to the same theory, though
many formulas occurring in that theory would have different constants.
But calculations are certainly nicer when sqrt(-1) is used.
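Here is a quick numerical check (in Python; the names mul2 and phi are
mine) that the map above preserves products:

```python
import math

SQRT2 = math.sqrt(2)

def mul2(a, b):
    # multiplication of numbers p + q sqrt(-2), using (sqrt(-2))^2 = -2
    p1, q1 = a
    p2, q2 = b
    return (p1 * p2 - 2 * q1 * q2, p1 * q2 + q1 * p2)

def phi(z):
    # the isomorphism x + y sqrt(-1) |-> p + q sqrt(-2), p = x, q = y/sqrt 2
    return (z.real, z.imag / SQRT2)

z1, z2 = 3 + 4j, -2 + 1j
lhs = phi(z1 * z2)            # multiply first, then map
rhs = mul2(phi(z1), phi(z2))  # map first, then multiply
assert all(abs(l - r) < 1e-12 for l, r in zip(lhs, rhs))
```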
----------------------------------------------------------------------
You ask, in connection with the geometric representation of the complex
numbers (p.13), about representing functions C --> C geometrically.
If one tried to "graph" them, the graph would have to be 4-dimensional,
which is impractical. However, one can visualize them as moving various
regions of C to other regions of C. In particular, if one draws
some regular system of lines etc. in a region of the domain, and shows
their images in the codomain, this can give a reasonable understanding
of such a function; see the top illustrations on p.163. (Incidentally,
the one at the top of that page shows the kind of function we will be
interested in in this course, while the second one shows a kind that we
will not be interested in in this course.)
Another thing one can do is make a 3-dimensional graph showing the
behavior of the real or imaginary part of the function. A lot of
plaster models of such things were made around 100 years ago. I think
there are some in a glass cabinet in 1015 Evans (northwest corner of
the room).
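As a tiny computational illustration (mine) of the grid-image idea:
under f(z) = z^2, the vertical line x = 1 is carried to the parabola
u = 1 - v^2/4 in the codomain.

```python
# Points 1 + iy on the line x = 1 map under z^2 to
# (1 - y^2) + i(2y), i.e. to points with u = 1 - v^2/4.
def f(z):
    return z * z

for y in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    w = f(complex(1.0, y))
    u, v = w.real, w.imag
    assert abs(u - (1 - v ** 2 / 4)) < 1e-12   # image lies on the parabola
```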
----------------------------------------------------------------------
You ask about the reason for the difference between the inequality
satisfied by the dot product of vectors in R^k, and equation (11)
on p.15, satisfied by multiplication of complex numbers.
I'm not sure what you want to know. The basic answer is that they are
two different operations, and there is no reason to expect different
operations to have the same properties. There are about three
things your question might mean: (1) "If you can prove that we have
equality in the case of multiplication of complex numbers, won't the
same proof show that this is true for dot product of vectors?" To
answer that, check that you understand the proof for multiplication of
complex numbers, and see what happens if you try to apply it to
vectors. (2) "What is the difference between the way I should picture
multiplication of complex numbers and the dot product of vectors, that would show
this difference?" (3) "What is it about the k=2 case that leads to
the existence of an operation with properties "better" than those
satisfied by dot product for general k ?" To ask each of these
questions more or less assumes that you understand the answer to the
one before. Think about them, and if you still have questions, ask
again, by e-mail or in office hours.
I guess yet another thing you might mean is (4) "Shouldn't any two
things called by the same name have the same properties?", and the
answer is definitely "No!" Different people are different; different
number-systems -- real, complex, rational, ... -- are different, etc.
The world would be a much duller place if they weren't. We try to
use a given name for things that have some basic properties in common,
but they won't be identical.
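For what it's worth, here is a numerical contrast (a Python sketch of
mine) between the two operations:

```python
import math

# Multiplication of complex numbers gives equality |z1 z2| = |z1||z2|,
# while the dot product only satisfies the inequality |u.v| <= |u||v|.
z1, z2 = 3 + 4j, 1 - 2j
assert abs(abs(z1 * z2) - abs(z1) * abs(z2)) < 1e-12   # equality, always

def norm(w):
    return math.sqrt(sum(a * a for a in w))

u, v = (3.0, 4.0), (1.0, -2.0)
dot = sum(a * b for a, b in zip(u, v))
assert abs(dot) < norm(u) * norm(v)   # strict here, since u, v are not parallel
```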
----------------------------------------------------------------------
You ask how, on p.15, the authors go from the display before the line
"which takes the form" to the display after that line.
Maybe your difficulty comes from not noticing that after the second
display the authors say "where z_1 = x_1 + i y_1, z_2 = x_2 + i y_2".
With that definition, is it clear to you that application of the
definition of modulus to the left-hand-side of the first line gives
the left-hand-side of the second line? (On the right-hand side, they
are simply multiplying out the square.)
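Behind that computation is the polynomial identity below, which you can
spot-check numerically (a Python check of mine):

```python
# The identity behind |z1 z2| = |z1||z2|, checked at sample points:
# (x1 x2 - y1 y2)^2 + (x1 y2 + x2 y1)^2 = (x1^2 + y1^2)(x2^2 + y2^2)
for (x1, y1, x2, y2) in [(1, 2, 3, 4), (-2, 5, 0.5, -1), (0, 1, 1, 0)]:
    lhs = (x1 * x2 - y1 * y2) ** 2 + (x1 * y2 + x2 * y1) ** 2
    rhs = (x1 ** 2 + y1 ** 2) * (x2 ** 2 + y2 ** 2)
    assert abs(lhs - rhs) < 1e-12
```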
I prefer to give definitions before rather than after the place where
they are used; but the latter is a common practice, so you should get
in the habit of checking whether some symbols are defined immediately
after the place where they appear.
----------------------------------------------------------------------
You ask how on p.16, final display, the authors go from conj(z_3^7)
to conj(z_3)^7.
They are applying equation (13) on that page, which says that the
conjugate of a product is the product of the conjugates. Since a
positive integer power of a number is a repeated product, the conjugate
of such a power is the power of the conjugate.
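You can confirm this numerically in a line or two of Python (my check,
not the book's):

```python
# Conjugation commutes with integer powers ...
z = 2 - 3j
assert abs((z ** 7).conjugate() - z.conjugate() ** 7) < 1e-9

# ... because it commutes with products (equation (13)):
z1, z2 = 1 + 1j, -2 + 0.5j
assert abs((z1 * z2).conjugate() - z1.conjugate() * z2.conjugate()) < 1e-12
```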
The wording of your question asks about going "directly" to
conj(z_3)^7; so perhaps you saw the method but thought that more steps
should be shown. The general rule is that steps should be shown if one
doesn't feel that one's audience will easily be able to figure them
out for themselves. As one progresses in a field, one should be
able to work out more and more for oneself, and it becomes more and
more the authors' function to supply only the genuinely difficult
steps or ideas.
----------------------------------------------------------------------
In connection with the authors' definition on p.24, that a set is
closed if and only if its complement is open, you ask whether the
complement of an open set is always closed.
Yes! Given an open set U, let's see whether S = C\U fits the
definition of a closed set. To do this, we take its complement. This
is C\(C\U) = U, which is open. So by definition, S is closed.
----------------------------------------------------------------------
You ask, concerning the definition of limit point on p.24, "Is the
neighborhood here only for some epsilon or for all epsilons?"
The authors write that z_0 is a limit point "if every N_epsilon(z_0)
contains a point of S other than z_0". So to understand this, we
have to understand what they mean by "every N_epsilon(z_0)". A disc
N_epsilon(z_0) is determined by two elements, the point z_0 and
the positive real number epsilon. But in this discussion, we are
assuming a particular z_0 given, about which we want to decide whether
it is a limit point, so "every" can't mean "for every point z_0". So
it must mean "for every positive real number epsilon."
There's one more step needed to correctly understand the definition.
We have figured out that it refers to "every positive real number
epsilon" and we see that it concerns the condition that N_epsilon(z_0)
contains a point of S other than z_0; i.e., "there exists a point
w such that w\in N_epsilon(z_0), w\in S, and w not-= z_0". Now
we have to know which way these conditions are put together; i.e.,
whether the definition means
[a] For every positive real number epsilon there exists a point
w such that w\in N_epsilon(z_0), w\in S, and w not-= z_0,
or
[b] There exists a point w such that for every positive real
number epsilon, w\in N_epsilon(z_0), w\in S, and w not-= z_0.
(The importance of such differences is discussed in the last pages
of the handout on sets and logic that I distributed.)
One can guess fairly well from the wording of the sentence in the book
that it means [a] and not [b]. (I.e., for each epsilon one can find
a point w that works, but it is not asserted that there is one
w that works simultaneously for all epsilon.) One can also think
about the geometry, and realize that [b] can never be true; so [a]
must be what they mean.
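To make [a] concrete, here is a small Python check (mine) with
S = {1/n : n = 1, 2, ...} and z_0 = 0:

```python
import math

# Interpretation [a]: for each epsilon we exhibit a witness w in S with
# 0 < |w - z_0| < epsilon. Note that the witness depends on epsilon;
# no single w works for every epsilon, which is why [b] fails here.
for epsilon in [1.0, 0.1, 0.003]:
    n = math.floor(1 / epsilon) + 1
    w = 1 / n                          # an element of S, different for each epsilon
    assert w != 0 and abs(w - 0) < epsilon
```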
----------------------------------------------------------------------
Regarding Proposition 2.1, p.25, you write
> ... It seems to me that even if S is open, it still contains
> all its limit points (every point of an open set is a limit point).
For S an open set, you're asking whether it isn't true that "if x
is a limit point of S, then x belongs to S", but the argument you
give in parenthesis shows "if x belongs to S then x is a limit
point of S". In other words, the converse of what you are asking
about.
An example showing that an open set need not contain its limit points
is the set S = N_1(0), i.e., the open unit disc. Points at distance
1 from the origin are limit points of S, but not contained in S.
----------------------------------------------------------------------
You ask how one can come up with the start of the proof of part (iii)
of Proposition 2.2, p.26, where we "add and subtract l g(z)".
That part of the Proposition says that if f(z) approaches l and
g(z) approaches k then f(z) g(z) approaches l k. This comes down
to showing that if f(z) is sufficiently near to l and g(z) is
sufficiently near to k, then f(z) g(z) will be near to l k. To
analyze how near they are, let us go from f(z) g(z) to l k in two
steps, first changing f(z) to l, and then changing g(z) to k;
i.e., we go "f(z)g(z) to l g(z) to l k". The change at the
first step is (f(z)-l)g(z), which will be small if f(z)-l is small
enough, and the change at the second step is l(g(z)-k), which will
be small if g(z)-k is small enough. So we see that this calculation
breaks the difference f(z) g(z) - l k into two summands,
(f(z) g(z) - l g(z)) + (l g(z) - l k). When we write it out
algebraically, it comes out as "adding and subtracting l g(z)".
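In symbols, the decomposition is

```latex
f(z)\,g(z) - l\,k \;=\; \bigl(f(z)\,g(z) - l\,g(z)\bigr) + \bigl(l\,g(z) - l\,k\bigr)
                  \;=\; \bigl(f(z)-l\bigr)\,g(z) \;+\; l\,\bigl(g(z)-k\bigr).
```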
----------------------------------------------------------------------
You ask about the meaning of the functions u and v introduced
on p.28, third display.
There f: S --> C is a function, where S is a subset of C.
Since each complex number z has the form x+iy for real numbers
x and y, one way of describing f is to specify, for each
pair (x,y) such that x+iy\in S, the complex number f(x+iy).
That complex number can be described by giving its real and imaginary
parts. Since these depend on x and y, we can call them
u(x,y) and v(x,y). Hence the complex-valued function f of one
complex variable is determined by two real-valued functions u and
v, each a function of two real variables.
The point of all this is so that mathematical results that one
has proved for functions of several real variables can be applied
when studying functions of a complex variable; and likewise that
understanding one has developed of functions of several real
variables can be applied to functions of a complex variable.
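For a concrete case (my example, not the book's): f(z) = z^2 gives
u(x,y) = x^2 - y^2 and v(x,y) = 2xy, which a few lines of Python can
confirm:

```python
# f(x+iy) = (x+iy)^2 = (x^2 - y^2) + i(2xy), so the real and imaginary
# parts of f are the two real-valued functions u and v below.
def u(x, y):
    return x * x - y * y

def v(x, y):
    return 2 * x * y

for x, y in [(1.0, 2.0), (-0.5, 3.0)]:
    w = complex(x, y) ** 2
    assert abs(w.real - u(x, y)) < 1e-12
    assert abs(w.imag - v(x, y)) < 1e-12
```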
I'm not sure what the source of your problem was, hence whether what
I have written answers it. Now that you've read the above, could
you write me a few lines about what it was that you were missing,
and whether it has been answered?
----------------------------------------------------------------------
You ask for a geometrical explanation of the fact that a function is
always considered continuous at an isolated point (p.28, last line),
saying "... The only picture I can think of is a discontinuity ...".
Well, this depends on what geometrical idea of continuity and
discontinuity you have picked up! The claim that "continuous" means
"you can draw it without lifting your pencil from the page" is
a primitive one that only works when the domain is an interval.
A better one might be expressed as "no conflict between what f(x)
is and what nearby values suggest that it should be". For instance,
consider the function f on R defined by letting
f(x) = 0 if x < 0,
f(x) = 1 if x >_ 0.
This is discontinuous at x = 0 because if we look at the value that
we get as we approach from the left, namely 0, the value at the point
itself, namely 1, and the value we get as we approach from the right,
which is also 1, there is a conflict -- the value we get approaching
0 from the left is not the same as the value at 0 (even though the
value we get approaching 0 from the right is the same as f(0)).
On the other hand, if we restrict f to nonnegative inputs, the
conflict disappears, because we can no longer approach from the left
and get a different value; so that restriction is a continuous function
on the nonnegative reals. Finally, if we restrict f to the 1-point
set {0}, then it remains continuous -- there is certainly no conflict
because there is nothing to conflict with the value it has there!
I hope this puts you on the right track. If it doesn't help, come by
my office hours about it.
----------------------------------------------------------------------
You ask for an example where a relatively open set (p.29) is not open.
If S is a subset of C, every subset of S that is relatively
open in S is gotten by taking an open subset U of C, and
forming S \intersect U. (This is a result we saw in my section
of Math 104; I would expect you would also have seen it in yours.)
If S itself is open, then S \intersect U will also be open, so
in that case, the relatively open subsets are just the open subsets.
However, if we take any S that is not open, then we can easily
get relatively open subsets that are not open. The easiest way is
to take U = C; then we see that S itself is a relatively open
subset of S (and you can check that the definition given in our
book confirms that), but by assumption, not open. More generally,
if S is not open it will contain a point p that is a limit
point of its complement. If we take any open set U containing
p, we will find that S \intersect U is relatively open in S,
but is not open in C.
You should draw an example or so of non-open sets S and choices of
U as above, to see concrete cases of the above general observations.
----------------------------------------------------------------------
You ask, in connection with the observation on p.29 that not every
relatively open set is open, whether every open set is relatively open.
Note that the concept of "relatively open set" is only defined for
subsets V of a given set S. In other words, it is only subsets
of S that we can talk about as being "relatively open in S". If
we keep this condition in mind, the answer is yes: Every _subset_of_S_
which is open in C is also relatively open in S. This follows
easily on comparing the definitions of "open" and "relatively open".
----------------------------------------------------------------------
You asked what use Proposition 2.4, p.30, could be, if it means that
checking continuity requires looking at every possible open set.
As I mentioned in class, one reason this translation of continuity is
important is that there are other contexts in which one has a concept
of open set, but not a concept of distance, so that one can't define
continuity by epsilons and deltas, but this criterion still makes
sense, and can be taken as the definition.
Moreover, a result about continuity can have other value than
as a method of verifying that particular functions are continuous.
For instance, Proposition 2.6 follows trivially from Proposition 2.4!
In contexts where one has a definition of open set that is not based
on distance, one often has nonetheless some family X of open sets
with the property that a set U is open if and only if every point of
U lies in a member N of X that is a subset of U. In our context,
the set of discs N_epsilon(z) plays the role of this X. In general,
one can use members of such a set X to prove particular functions
are continuous in essentially the way we have used discs.
----------------------------------------------------------------------
You ask, in connection with the authors' remark on p.33 that paths
can be very complicated, what a space-filling curve is.
It's a path whose image is 2-dimensional. In the text from which I
taught 104 last Semester, Rudin's "Principles of Mathematical Analysis",
Exercise 14 on p.168 shows how to construct a continuous function from
the interval [0,1] onto the unit square, {(x,y) | x,y\in [0,1]}.
(The same text is being used this Semester for sections 2 and 4 of
Math 104, so you should be able to look at a copy in the store.)
----------------------------------------------------------------------
I hope what I said in class helped with your problem with the Paving
Lemma (p.39). In summary -- it is quite true that if we just start
producing _in_an_arbitrary_way_ successive intervals in [a,b] each
of which has image in a disc in S, the discs might "dramatically
decrease in size" so as to render reaching b impossible. That
is, there are _some_ ways of attempting to pave the path where
we "pave ourselves into a corner". But the proof of the Lemma
shows that there will nonetheless be _other_ ways of doing it that
don't fail. We show this by considering all possible partial
pavings, and showing that if none of them were a total paving, we
could get a contradiction.
----------------------------------------------------------------------
In connection with the Paving Lemma, p.39, you draw a picture of a
domain S that narrows to a sharp point, and a path gamma ending
at that point, and ask "How do we pave gamma in S ?"
Good question! To answer it, we have to think about what your picture
really shows. Are the "edges" and "corners" of the region that you
draw to be understood to belong to S ? No, they can't, otherwise S
wouldn't be open. (A point z on an edge or at a corner does not have
the property that some disc about z is entirely contained in S.)
On the other hand, does the point at which gamma terminates belong
to the image of gamma ? It has to: gamma is a continuous function
on a closed interval, so its value at the end of the interval must be
the value that it is approaching as it gets near that end.
But this means that that point of the image of gamma does not belong
to S. So the hypothesis of the Paving Lemma is not satisfied.
----------------------------------------------------------------------
You ask, concerning the "unjustified fear" referred to in the middle of
p.39, in the proof of the Paving Lemma, what sort of path could
force the paving discs to shrink so that one never got to the end
of the path.
Well, of course, given that the Lemma is eventually proved, one knows
that no path can force this. The point the authors are making, though,
is that if we simply construct a sequence of values t_0 < t_1 < t_2 <
... in [a,b], we have no guarantee that it will eventually reach b.
It could converge to some value < b, or even converge to b but
never reach it. Indeed, if we choose our t_n poorly, this will
happen. To prove the Lemma we have to show that by choosing them well,
we can make them both form a paving, and reach b.
The proof is nontrivial. Can you find an easier proof (not using
results about compactness from 104)?
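Here is a concrete instance (mine) of a poorly chosen sequence, with
a = 0 and b = 1:

```python
# The sequence t_n = 1/2 - (1/2)^(n+1) is strictly increasing, but it
# converges to 1/2, so it never gets anywhere near b = 1.
t = [0.5 - 0.5 ** (n + 1) for n in range(30)]
assert t[0] == 0.0                                       # starts at a = 0
assert all(t[i] < t[i + 1] for i in range(len(t) - 1))   # strictly increasing
assert t[-1] < 0.5                                       # stuck below 1/2 < b
```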
----------------------------------------------------------------------
You ask what the motivation of the Paving Lemma (p.39) is.
Well, I think the authors do a good job of motivating it in the
second sentence on p.22. The one thing that they don't explain there
is why the "new path" that one gets in the way they sketch can replace
the old, arbitrarily messy path in the computations we will be making.
But they can't explain this now, because we haven't yet proved the
properties of complex integration that will show that complex integrals
are not affected by "local" changes in their paths.
----------------------------------------------------------------------
You ask whether in the definition of path connectedness on p.40,
the phrase "given z_1, z_2 \in S" means "for all z_1, z_2 \in S"
or "for some z_1, z_2 \in S".
It means "for all z_1, z_2 \in S".
----------------------------------------------------------------------
You ask for clarification of the concept of the complement of a path,
p.43 top.
The complement of a subset S of C is defined on p.24, right after
the Example. It means the set of points of C that are not in S.
For a vivid way of visualizing this, imagine that in a picture of a
subset S of C we color the points of S black and leave the
background white. Then a picture of the complement would be the
"photographic negative of" the picture of S. (This is an imperfect
way of describing it, because it doesn't handle things like using
dotted lines for boundary points not in a set; but it gives a first
idea.) Now the definition given on p.43 simply says that the authors
will use the phrase "complement of gamma" as shorthand for "complement
of the image of gamma".
You also ask how the complement of a path can be connected. An example
is given in the "one-component" picture on that page. In that picture,
the complement of the path consists of all points of C except the
thick dark line. Do you see that that set is connected?
----------------------------------------------------------------------
You ask about the element t_0 occurring in the second display
on p.44.
Well, the parenthesized condition "(t_0\in [a,b])" does not show
explicitly whether it means "for all t_0\in [a,b]" or "for some
t_0\in [a,b]". You have to decide that from context, and in this case,
since they are saying that mu attains its bounds, and the first part
of the display says that k is a bound for mu(t), the second half
of the display must say that this bound is attained. Hence it must
mean "k = mu(t_0) for _some_ t_0".
(In my mathematical writing, I try to be more explicit about how
elements are quantified than the authors are here. I recommend that
you do so too.)
----------------------------------------------------------------------
You ask, in connection with the statement on p.51 that Cauchy sequences
are convergent, what a Cauchy sequence is.
It is a sequence (p_n) satisfying the condition stated on the
second and third line of that page, namely that for every epsilon > 0
there exists a positive integer N such that m, n > N implies
|p_m - p_n| < epsilon.
The result that every Cauchy sequence converges is true in R^k for
all k. (Hopefully you saw this in some form in 104.)
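As a concrete example (mine, in Python): the partial sums of
Sigma 1/k^2 form a Cauchy sequence, with the tail estimate
sum_{k>n} 1/k^2 < 1/n supplying the N:

```python
# p_n = 1/1^2 + ... + 1/n^2. For m > n, |p_m - p_n| = sum_{k=n+1}^m 1/k^2
# is less than 1/n, so given epsilon we may take any N >= 1/epsilon.
p, s = [], 0.0
for k in range(1, 5001):
    s += 1.0 / (k * k)
    p.append(s)

epsilon = 1e-3
N = 1001                                # N >= 1/epsilon, by the tail estimate
for m, n in [(1500, 1200), (5000, 1002)]:
    assert m > N - 1 and n > N - 1 or (m, n) == (1500, 1200)
    assert abs(p[m - 1] - p[n - 1]) < epsilon
```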
----------------------------------------------------------------------
You ask about the third-from-last display on p.55.
The conclusion of that display, putting the inequalities together, is
that for r > N, |a_r| < rho^(r-N) |a_N|. Recalling that 0 < rho < 1,
we see that this means that after the Nth term, the numbers |a_r|
decrease at least as fast as a geometric progression with ratio rho,
and we know this converges. To make the argument precise, we can let
K = rho^(-N) |a_N|, so that our formula becomes |a_r| < K rho^r.
Since Sigma_r rho^r converges, the comparison test tells us that
Sigma_r a_r converges.
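Here is a numerical sketch (mine) of that comparison, with sample terms
satisfying the bound:

```python
# If |a_r| < K rho^r with 0 < rho < 1, the partial sums of |a_r| stay
# below the geometric-series bound K/(1 - rho), so Sigma_r |a_r| converges.
rho, K = 0.5, 3.0
a = [0.9 * K * rho ** r for r in range(100)]   # sample terms below K rho^r
partial = sum(a)
geometric_bound = K / (1 - rho)                # value of sum_r K rho^r
assert all(term < K * rho ** r for r, term in enumerate(a))
assert partial < geometric_bound
```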
----------------------------------------------------------------------
You ask about the place on p.57 where the authors say "by the
definition of lim sup ...".
Well, looking closely at that point, I see that the authors have
made one serious misstatement, and one error of judgement. The
misstatement is easy to correct: Where they say "for all n larger
than some N" they should simply have said "for infinitely many n".
(The inequality need not be true for all n larger than some N;
e.g., it might just be true for all even n, or all prime n.)
The error in judgement is that they should have remembered that
different texts use different definitions to get to the same concept.
The definition we used in my 104 last semester is not directly
applicable, but the property that I mentioned in class is -- that
if we take any number less than the lim sup, then infinitely many terms
of the sequence are > that value. Since rho > R, we have
1/rho < 1/R, so infinitely many values of |a_n|^(1/n) are > 1/rho.
Thanks for pointing this out!
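A small Python illustration (mine) of the property just stated:

```python
# If c is any number below lim sup a_n, then a_n > c for infinitely many n.
# Example: a_n = 1 + (-1)^n has lim sup 2; for c = 1.9, every even n works
# (while no tail of the sequence stays above c, since the odd terms are 0).
c = 1.9
indices_above = [n for n in range(1000) if 1 + (-1) ** n > c]
assert indices_above == list(range(0, 1000, 2))   # exactly the even n
```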
----------------------------------------------------------------------
You ask how the authors get from the second display on p.61 to
the third, using Figure 3.2.
As I said in class, the dot in the (m,n)th position in that figure
represents the product a_m b_n. Thus, each c_n is represented
by the sum of the diagonal string of dots going from a_0 b_n (upper
left) to a_n b_0 (lower right).
Hence in the second display on that page, the sum of c_n from
n = 0 to 2N is represented by the large lightly shaded triangle,
including the darker square. On the other hand, the number subtracted
from it in that formula is the sum over that dark square; so when one
does the subtraction, one is left with the two smaller shaded triangles.
The absolute value of the sum over each such triangle is _< the sum
of the absolute values, which is in turn _< the sum of the absolute
values of the terms of the small square to which the triangle can be
completed. Now if you look at the third display on this page, you can
see that the first product-of-sums-of-absolute-values is precisely this
sum for the lower-right square, while the second is this sum for the
upper-left square. The last term is not part of the computation based
on the figure; it comes from the inequality (i) on the preceding page.
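If it helps, here is a numerical sketch (mine, in Python) of that
bookkeeping, with sample sequences in place of the a_n and b_n of the
text:

```python
# c_n = sum_{m=0}^n a_m b_{n-m}. The sum of c_0..c_{2N} minus the square
# (a_0+...+a_N)(b_0+...+b_N) is exactly the sum over the two corner
# triangles of Figure 3.2, i.e. pairs with m + n <= 2N but m > N or n > N.
a = [1 / (m + 1) ** 2 for m in range(11)]     # sample absolutely summable terms
b = [(-0.5) ** n for n in range(11)]
N = 5

c = [sum(a[m] * b[n - m] for m in range(n + 1)) for n in range(2 * N + 1)]
square = sum(a[: N + 1]) * sum(b[: N + 1])
triangles = sum(a[m] * b[n]
                for m in range(2 * N + 1) for n in range(2 * N + 1)
                if m + n <= 2 * N and (m > N or n > N))
assert abs(sum(c) - square - triangles) < 1e-12
```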
----------------------------------------------------------------------
You ask why, in the proof of the chain rule, pp.66-67, it would be
a problem if f(z)-f(z_0) were zero, "especially since it seems like
f(z) and f(z_0) aren't really used in the proof and that only
g(z) and g(z_0) are".
As I mentioned in class, in the bottom part of p.66, all the
"g(g(...))"s are typos for "g(f(...))". (I think I compounded the
confusion by saying "f(g(...))" instead of "g(f(...))".) Likewise the
denominators should have been f(z) - f(z_0). So points where
f(z) = f(z_0) are a real problem.
> I'm also not sure of how the second method fixes the problem
The idea of the second method is to replace the formula at the bottom of
p.66 (once the "f"s have been put where they should be) by the
corresponding formula with "fractions cleared". However, that formula
can then no longer be expressed as a limit formula, since one has
multiplied by something that goes to 0 as z --> z_0. Rather, one
gets a formula saying that the numerator on the left differs from the
formula we get on the right by something that goes to 0 as z --> z_0
faster than f(z) - f(z_0) does. That is expressed by bringing in
a function h(f(z)), which --> 0, and multiplying it by f(z)-f(z_0).
(Remember that u = f(z_0).) Finally, as the last step (given in words),
one divides by z-z_0 (which is "safe", because unlike f(z)-f(z_0),
this cannot take the value 0 except at z = z_0), and goes back to a
formula in limits. You should write out the equations that the authors
refer to in words at that final step -- they leave it for you to do
because it is most impressive when you do the calculations yourself.
----------------------------------------------------------------------
You ask about the h in the top half of p.67.
It is defined by the two displayed equations -- the first gives
the value at points other than u; the second gives the value at
u itself. (That second equation could have been written
"h(w) = 0 when w = u", but the way they wrote it is simpler!)
> if f(z)=u then h(f(z))=h(u)=undefined, no?
No -- by that second line of the definition, it is defined as 0.
As w --> u, the ratio that forms the first term in the definition
of h approaches g'(u), by definition of g'(u); so subtracting
the second term, we get h(w) --> 0. Since h(u) has been defined
to be 0, we conclude that h is continuous at u.
> why is h defined that way?
It is an "error term" for the approximation of g(w) near w = u,
g(w) ~ g(u) + g'(u)(w-u). Specifically, it makes the equation
g(w) = g(u) + (g'(u) + h(w)) (w-u)
hold. (You should check this, both for the case w not-= u and
the case w = u. For the latter case, any value would work, but
using the value h(u) which makes h continuous at u makes
the rest of the proof work.)
Substituting f(z) for w in the above equation, and moving the
g(u) to the left-hand side, gives the last display of the proof.
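To see the construction in action, here is a Python check (mine) with
the sample choice g(w) = w^3 and u = 2:

```python
# The error term h for g(w) = w^3 at u = 2, where g'(u) = 3*2^2 = 12.
def g(w):
    return w ** 3

u, g_u, gprime_u = 2.0, 8.0, 12.0

def h(w):
    if w == u:
        return 0.0                                # second line of the definition
    return (g(w) - g_u) / (w - u) - gprime_u      # first line of the definition

for w in [2.1, 2.001, 1.999]:
    # the identity g(w) = g(u) + (g'(u) + h(w))(w - u) holds by construction
    assert abs(g(w) - (g_u + (gprime_u + h(w)) * (w - u))) < 1e-9
assert abs(h(2.000001)) < 1e-4   # h(w) --> 0 = h(u): h is continuous at u
```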
----------------------------------------------------------------------
Regarding the Cauchy-Riemann equations (p.67), you are surprised that
the rate of change in one direction can determine the rate of change
in the perpendicular direction.
Well, when a function of z approaches a limit L as z --> z_0,
then for every epsilon there exists a delta such that the value
of the function at z differs from the value of the function at
z_0 by less than epsilon -- no matter what direction z lies
in relative to z_0! The definition of the derivative of a complex
function (p.64, bottom) is a statement about the limit of a ratio,
so once we assume that this limit exists, we can determine it by
approaching z_0 from any direction; and the limit we find must
then determine the behavior when we come from other directions.
Moreover, when we come from a perpendicular direction, the denominator
in that ratio is, roughly speaking, i times what it is when we come
from the original direction, so for the ratio to approach the same
limit, the real and imaginary parts of the complex function must be
changing in a way different from, though determined by, the way they
change as we approach from the original direction.
So if there is something strange, it is in the very idea that we
should expect a function to have a derivative in this sense. The
"Math 53" version of differentiability of functions of 2 variables
is much less restrictive.
Yet the same computations that work for functions of a single real
variable show that polynomials and power series in a complex variable
have derivatives in the "math 185" sense, and thus satisfy the
Cauchy-Riemann equations.
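A quick numerical check of my own (the function and the point are my
choices): for f(z) = e^z, which is differentiable in the "Math 185"
sense, estimating the four partials of u = Re f and v = Im f by central
differences really does give u_x = v_y and v_x = -u_y.

```python
# Verify the Cauchy-Riemann equations numerically for f(z) = exp(z)
# at an arbitrary point, using central-difference approximations.
import cmath

def f(z):
    return cmath.exp(z)

x0, y0, d = 0.3, -0.7, 1e-5

u = lambda x, y: f(complex(x, y)).real
v = lambda x, y: f(complex(x, y)).imag

u_x = (u(x0 + d, y0) - u(x0 - d, y0)) / (2 * d)
u_y = (u(x0, y0 + d) - u(x0, y0 - d)) / (2 * d)
v_x = (v(x0 + d, y0) - v(x0 - d, y0)) / (2 * d)
v_y = (v(x0, y0 + d) - v(x0, y0 - d)) / (2 * d)

assert abs(u_x - v_y) < 1e-6      # u_x = v_y
assert abs(v_x + u_y) < 1e-6      # v_x = -u_y
```
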
----------------------------------------------------------------------
You ask whether the Cauchy-Riemann equations (p.67) apply to
functions on R^2 as well as to functions on C, saying "It doesn't
seem that the results du/dx = dv/dy and dv/dx = -du/dy would follow.
Is this correct?"
The question is, "follow from what?"
I emphasized in class on Wednesday the difference between
"differentiability in the sense of Math 53" and "differentiability
in the sense of Math 185". They are different conditions, a
weaker one and a stronger one. The weaker one does not imply the
Cauchy-Riemann equations, the stronger one does. In fact, the
stronger condition, which we are studying, is equivalent to the
weaker one together with the Cauchy-Riemann equations.
----------------------------------------------------------------------
Regarding the comment below the middle of p.68, you ask
> If D'Alembert knew about the Cauchy-Riemann equations in 1752, then
> why didn't he publish his work and take the credit?
I would guess that he didn't consider them especially important. I
don't even know that he didn't publish them -- they may just have
appeared incidentally in his published work. If Cauchy later made them
into a tool for systematically obtaining important properties of
complex functions, then it would be natural for people to associate
them with Cauchy's work.
Once one has the definition of the derivative of a complex function,
anyone familiar with partial differentiation who works with the
concept is likely to come up with that pair of equations. Discovering
what one can do with them is another matter!
Naming rights in mathematics are something that some people worry
about, but most people using these terms don't think that much about
who they are named after; the names are simply convenient labels.
----------------------------------------------------------------------
You ask what the benefits of Lemma 4.5, p.69 are to what we are
learning.
The words preceding it are "We begin with a technical lemma". By
saying this they are implying (in several ways: by using the word
"begin", and by calling it a "lemma") that it will be a tool in
what they are starting to do. So look ahead at the next main result,
the Theorem on the following page, and you will find what use they make
of this lemma -- in the first step of the proof of that Theorem.
----------------------------------------------------------------------
You ask how, in proving Lemma 4.5, p.69, the authors can use
the Mean Value Theorem, when they have only assumed the partial
derivatives of u defined at the one point (x,y).
They ought to say that the partials are defined in a neighborhood
of (x,y)! This is what they are implicitly assuming.
----------------------------------------------------------------------
You ask about where the term epsilon(h,k) comes from in the
proof of Lemma 4.5, p.69.
The display before (3) on that page, where epsilon(h,k) is
introduced, should be considered the definition of that function.
We are looking at x and y as fixed, and theta has been chosen
in a way depending on h and k, so the left-hand side of
that equation depends only on h and k, and so can be denoted
epsilon(h,k). The statement that epsilon(h,k) --> 0 as h, k --> 0,
made on the next line, is a consequence of the continuity of the
partial derivative in question.
----------------------------------------------------------------------
You ask about the computation at the middle of p.70, beginning
"Using the Cauchy-Riemann equations ...".
I see there is a small error in the formula the authors give:
They have left out the coefficient "i" in the terms with epsilon_2
and eta_2. However, that has no effect on the argument of the proof,
since all they do with those terms is to show that they are not too
large, using estimates of their absolute values, which are not affected
by the coefficient i.
The nontrivial calculation is the one involving the partial derivatives.
To get this right, collect together the two terms involving h and
partial derivatives, and collect together the two terms involving k
and partial derivatives, factoring out the h and k. You will see
that the former term that you get is exactly the "h part" of the
answer the authors give. The "k part" looks all wrong; to fix it,
apply the Cauchy-Riemann equations to both partial derivatives shown
there.
Let me know whether you find that this works!
----------------------------------------------------------------------
You ask about the change from "z - z_0" to "h+ik" in the proof of
Theorem 4.6, p.70.
This actually goes back to the first line of the proof. From that
line, we are to understand that we are writing z_0 as x_0 + i y_0,
and z as (x_0 + h) + i(y_0 + k). (If I were doing the writing
I would say this explicitly; but the authors take it for granted
that the equation at the beginning of the proof makes it obvious --
which it does if one pays attention to it.) Once one has seen this,
the equality z - z_0 = h+ik that you ask about follows.
----------------------------------------------------------------------
You ask why in the last phrase on p.70 they specify that the
partial derivatives are continuous at (0,0), when they are in
fact continuous everywhere.
That example is given to show that differentiability can hold at a
single point without holding anywhere else. To prove differentiability
at (0,0) by the preceding lemma, they need to know the partials are
continuous there. So they emphasize that continuity holds at that
point even though it is a special case of the fact that it holds
everywhere.
----------------------------------------------------------------------
> I would like to know the difference between a proposition and
> a theorem.
It's subjective. Theorems are "main results", propositions are
"secondary results", lemmas are results proved not for their own
sake but to help prove something else. But if a lemma used in
proving something else turns out to be useful in proving many other
things, it could just as well be regarded as important and called
a theorem. So it's up to the author.
There's the story about the Chair of a math department somewhere who
decided that he was going to be smart, and not base promotions on the
number of articles his faculty wrote (because one long article could
have the same contents as two short ones), nor even on the number of
pages -- but on the number of theorems proved! Result: after that,
in the papers of his faculty, there were no lemmas, propositions, or
corollaries -- everything was a theorem.
----------------------------------------------------------------------
With regard to statements like those on p.71, you ask
> When a problem or theorem states that a function f is differentiable
> in some domain D, is it implied that D is a subset of the complex
> field, or that D is any arbitrary set?
The word "domain" unfortunately has two meanings that are relevant
in this course. In general mathematical usage, the "domain" of a
function means the set of inputs for which the function is defined.
In complex analysis, a "domain" also means a connected open subset of
the complex plane.
In this course, "domain" will almost always have the latter meaning.
In particular, if they say a function has a certain property "in" or
"on" a domain, or if they start out "Let D be a domain", this is
almost surely what it means. But if they refer to "the domain of the
function", then you should worry, and check for clues in the context
(or ask me).
----------------------------------------------------------------------
You ask what it means for a function on a 2-dimensional region to
be constant (p.71).
It means that its values at any two points of the region are the same.
----------------------------------------------------------------------
You ask about the equation on p.71, 2nd line of last paragraph of
proof of Theorem 4.7, "curly-d u / curly-d x = phi' = 0".
They are really combining two facts: curly-d u / curly-d x = phi',
by the definition of partial derivative, while curly-d u / curly-d x
= 0 by the first paragraph of the proof. From these two facts they
conclude that phi' = 0.
In general, when we know two equations, A = B and A = C, and
want to show that these equations imply B = C, I prefer to
write B = A = C, where each step is a known equality and the
chain of steps connects the things we are interested in. But many
mathematicians write such a situation "A = B = C" (think: "A = B
and also = C"), as the authors do here.
----------------------------------------------------------------------
Concerning the first paragraph of section 4.4, p.72, you ask what
the difference is between a real-valued function of a complex variable
and a complex-valued function of a real variable.
They're very different: The first is a function C --> R, the
second is a function R --> C ! (More precisely, in each of these
cases, the domain of the function, C or R, can be replaced by
some subset, e.g., a domain D in the plane, or an interval [a,b]
in the line; but I wrote the full sets C and R above to emphasize
the natures of the two sorts of function.)
Thus, when they talk of the variable (real or complex), that word
refers to the input of the function, while when they talk of the
values (real or complex), that refers to the outputs.
----------------------------------------------------------------------
You ask how examples like f(z) = x^2 are consistent with the statement
on p.72 that if a real-valued function of a complex variable is
differentiable, then it is constant.
The point to remember is the distinction I have emphasized between
"differentiable in the sense of Math 53" and "differentiable in the
sense of Math 185"; i.e., between differentiable looked at as a
function of two real variables and differentiable looked at as a
function of one complex variable. The function f(z) = x^2 has the
"math 53" differentiability property, but not the "math 185" property,
which is what we mean by differentiability here.
To see explicitly that f(z) = x^2 does not have the latter property,
take the value z = 1, and consider the behavior of (x^2 - 1^2)/(z-1)
as z --> 1 along the real axis, and along the line {1+it|t\in R}.
(Incidentally, in that paragraph, the book says "where D is an
open set", but they mean "where D is a domain".)
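Here is the suggested check carried out numerically (my own sketch, not
the book's): the difference quotient for f(z) = x^2 at z = 1 approaches
2 along the real axis but 0 along the vertical line through 1, so no
single limit, and hence no complex derivative, exists there.

```python
# f(z) = (Re z)^2; compare the difference quotient (f(z) - f(1))/(z - 1)
# as z -> 1 along two different directions.

def f(z):
    return z.real ** 2

t = 1e-6
along_real = (f(1 + t) - f(1)) / ((1 + t) - 1)        # z = 1 + t
along_vertical = (f(1 + 1j * t) - f(1)) / (1j * t)    # z = 1 + it

assert abs(along_real - 2) < 1e-5   # limit 2 from the real direction
assert abs(along_vertical) < 1e-5   # limit 0 from the vertical direction
```
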
----------------------------------------------------------------------
You ask how the authors get the formula for (1-q)^-2 used at
the top of p.74.
Good question!
There are two ways one can get this. One of them is to take the power
series for (1-q)^-1 and square it, using Theorem 3.9, p.60.
Another way is analogous to the way we get the formula for (1-q)^-1.
Remember that we get that from the curious observation that if you
take a finite sum, 1 + q + q^2 + ... + q^n, and multiply it by
1 - q, it simplifies greatly. Hence that sum can be written as
(simple formula)/(1-q). If we now let n -> infinity, then if |q| < 1,
the thing I called "simple formula" turns out to approach 1, so the
limit of the quotient, i.e., the sum of the infinite series, is
1/(1-q).
Now to sum the series at the top of p.74, let us take a finite sum
1 + 2q + 3q^2 + ... + n q^(n-1), multiply it by 1 - q, and then
multiply again by 1 - q. I leave it to you to explore what happens.
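If you want to see where the exploration should lead, here is a
numerical sketch of my own: for |q| < 1 the partial sums of
1 + 2q + 3q^2 + ... do approach 1/(1-q)^2.

```python
# Check numerically that sum_{n>=1} n q^(n-1) = 1/(1-q)^2 for |q| < 1,
# using a complex q of modulus 1/2.

q = 0.3 + 0.4j                     # |q| = 0.5 < 1
s = sum(n * q ** (n - 1) for n in range(1, 200))

assert abs(s - 1 / (1 - q) ** 2) < 1e-10
```
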
----------------------------------------------------------------------
Right -- on p.74, near the bottom, "Sigma_1" and "Sigma_2" mean the two
summations on the preceding line.
----------------------------------------------------------------------
You ask whether the reason the authors sketch on p.78, next
to last paragraph, for why one can't patch together different
complex-differentiable functions to get another complex-differentiable
function also shows that one can't do so for any function on R^2
differentiable in the sense of multivariable calculus.
No. The point they make is that if a function is differentiable in
the sense of complex analysis, the way the function changes as one
goes in one direction determines the way it changes as one goes in
another. E.g., if, as we move away from a point (x_0, y_0) in the
x-direction, f(x+iy) increases at a rate of 3+i times the rate
of increase of x, then as we move away from (x_0, y_0) in the
y-direction, f(x+iy) will increase at a rate of i(3+i) times the
rate of increase of y; this is the content of the Cauchy-Riemann
equations. But nothing of the sort is true of functions
differentiable in the sense of real functions of 2 variables. E.g.,
if we take the function f(x,y) = x, then as x increases f(x,y)
increases at the same rate but as y increases f is unchanged; if we
instead used f(x,y) = x+y, the same statement would be true of what
happens as x increases, but the behavior as y increases would be
different; so the former behavior does not determine the latter.
(Of course, we haven't yet _proved_ that differentiability in the sense
of this course implies that functions can't be patched together. This
is, as the authors say, just a glimpse of what we will prove in later
chapters, together with their insight as to what is ultimately behind
the proofs we will see.)
----------------------------------------------------------------------
You ask what it means to raise a real number to a complex power (p.83).
There is no "natural" interpretation analogous to "raising a
number x to the n-th power, where n is an integer, means
multiplying it by itself n times". On the other hand, if we
interpret the exponential function e^x as the solution to a
differential equation, we find that the complex function e^z
also satisfies that equation. So some of the ideas motivating
exponentiation with real exponents work for complex exponents,
and others don't.
----------------------------------------------------------------------
You ask about the statement in the middle of p.86, that if k is the
greatest lower bound of the set of positive t for which cos t = 0,
then cos k = 0.
The greatest lower bound of a set of real numbers is always a limit
point of that set, and if k is a limit point of the set of t at
which a continuous function takes the value 0 then it is easy to
verify, from the definition of continuity, that the function takes the
value 0 at k. (To put it another way: the set of points where the
function is 0 is closed, hence it contains its limit points.)
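A numerical illustration of my own: bisection on an interval where cos
changes sign closes in on the smallest positive zero of cos, which is
the k of the text, and cos really does vanish there (and the value
agrees with pi/2, as we will see).

```python
# Bisection for the greatest lower bound of the positive zeros of cos.
import math

lo, hi = 1.0, 2.0                  # cos(1) > 0 > cos(2)
for _ in range(60):
    mid = (lo + hi) / 2
    if math.cos(mid) > 0:
        lo = mid
    else:
        hi = mid

assert abs(lo - math.pi / 2) < 1e-12   # the g.l.b. is pi/2
assert abs(math.cos(lo)) < 1e-12       # and cos vanishes there
```
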
----------------------------------------------------------------------
You ask how the authors can simply "define" pi, as they do on p.86.
In mathematical writing, a symbol or term can be defined in any
way, as long as we make clear what we mean by it, and use it
consistently. In practice, one should not make definitions that
conflict with established usages, especially usages that are
relevant in the current context. But in this case, what the authors
are doing is re-developing the theory of trigonometric functions
without calling on background knowledge from Euclidean geometry.
The "pi" that they are defining will turn out to be the same "pi"
that we learned about in High School geometry. So they are using
their privilege of defining a symbol to mean anything, to set us on
a road that will rejoin the road of the standard meaning of pi.
And they don't "pull that meaning out of thin air". In trigonometry
based on Euclidean geometry, pi/2 is indeed the g.l.b. of the set
of positive t for which cos t = 0; so if we define it that way
here, it should give a meaning consistent with the standard meaning.
(There are other ways of characterizing pi in conventional
trigonometry; they have chosen this characterization as the one
that will be easiest to prove exists and to use in their context.)
----------------------------------------------------------------------
You ask about the statement on p.88 that "From the standard theory of
alternating series, the sum of the first n terms alternately
overestimates and underestimates the actual limit, allowing us to
make very precise estimates of the trigonometric".
You should look back at the development of alternating series in your
Math 104 text to see that the partial sums will be alternately greater
than and less than the sum of the series (which is what the authors mean
by "overestimate" and "underestimate"). If that text doesn't have such
a development, let me know what book it is, and you can come see me at
office hours for the details.
As for getting "very precise estimates" -- the authors take the
inequalities showing that the partial sums are alternately greater
and less than the sum of the series, and use them, on this page,
to establish the value of cos 1 to several decimal places. In
particular, the last two inequalities they show prove the statement
that they make about the value to 4 places.
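Here is a sketch of my own of that alternating behavior for
cos 1 = 1 - 1/2! + 1/4! - 1/6! + ... : the successive partial sums land
alternately above and below the true value.

```python
# Partial sums of the alternating series for cos 1 alternately
# overestimate and underestimate cos 1.
import math

target = math.cos(1)
partial, sign = 0.0, 1
sums = []
for k in range(6):
    partial += sign / math.factorial(2 * k)
    sign = -sign
    sums.append(partial)

# even-indexed partial sums overestimate, odd-indexed underestimate
for i, s in enumerate(sums):
    assert (s > target) if i % 2 == 0 else (s < target)
```
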
----------------------------------------------------------------------
You ask about the second half of the proof of Prop.5.1, p.89.
When the authors refer to "properties of the exponential function
established in section 2" they are using the fact that the exponential
function is positive and strictly increasing on the real line. Since
we know that e^0 = 1, if we have |e^u| = 1 then u can't be less
than or greater than 0, so it must equal 0.
Once we know that e^u = e^0 = 1, we can substitute into the
first display of the proof, and conclude that cos v + i sin v = 1;
let me know if you have any difficulty with what the authors do
after they get that equation.
----------------------------------------------------------------------
You ask about the occurrence in the last display on p.89 of "w",
which has not appeared before in the discussion.
Comparing with the previous display, we see that w corresponds to
the iz of that equation. In writing out the equation without
saying what "w" refers to, the authors mean to say that it is an
identity holding for all complex numbers w. Since every complex
number w is i times some other complex number z, the fact that
the preceding equation holds for all z does indeed imply that this
one holds for all w, so that i rho is a period of exp.
----------------------------------------------------------------------
You ask about the computation of the periods of sin and cos at
the bottom of p.89.
The last display on that page shows that if rho is such a period,
then i rho is a period of the exponential function. Hence i rho
has the form 2 n pi i. Dividing by i, we see that rho has the form
2 n pi.
Unfortunately, there are a bunch of typos right below that display:
The last two equations on the page should be "i rho = 2 n pi i" and
"rho = 2 n pi"!
----------------------------------------------------------------------
You ask whether the definitions of sinh and cosh (p.91) can be
related to the exponential function in a way like that used for sin
and cos at the beginning of the chapter.
Well, sin x and cos x are formed from the odd and even terms of
the power series expansion of e^(ix). Similarly, sinh x and cosh x
are formed from the odd and even terms of the power series expansion
of e^x. Check out the power series you get from the definitions of
these functions in the book, and you'll see.
(Also, just as cos and sin are related to the coordinates of
points on the circle x^2 + y^2 = 1, so cosh and sinh are
related to the coordinates of points on the hyperbola x^2 - y^2 = 1.
I'll talk about that briefly in class, if I find some time.)
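You can check the claim about the even and odd terms numerically; here
is a sketch of my own: the even-degree terms of the series for e^x sum
to cosh x, and the odd-degree terms to sinh x.

```python
# cosh x and sinh x as the even and odd parts of the series for e^x.
import math

x = 0.8
even = sum(x ** n / math.factorial(n) for n in range(0, 30, 2))
odd = sum(x ** n / math.factorial(n) for n in range(1, 30, 2))

assert abs(even - math.cosh(x)) < 1e-12
assert abs(odd - math.sinh(x)) < 1e-12
```
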
----------------------------------------------------------------------
You ask who Riemann was.
He's the subject of Chapter 26 in E.T.Bell's "Men of Mathematics".
I haven't re-read it recently, but it's an entertaining book, though
the author is very opinionated about a lot of things, so that you
should take his judgements with a grain of salt.
----------------------------------------------------------------------
You ask what complex integration (p.95) is good for.
One important value will be to reconstruct a function from its
derivative, by the Fundamental Theorem of Calculus for complex
integration.
Another will be as an aid in computing real integrals! We'll see
how, in section 12.3.
I also sketched on Wednesday a reason why certain sorts of complex
functions should have value at the center of a circle equal to the
average of their values on the circle. This will be part of a
general technique whereby the value of a function at a point can be
evaluated using an appropriate integral involving its values on a closed
path around that point.
----------------------------------------------------------------------
You are right to point out the choice of the s_r's as a point about
which the text's description of integration (p.96) is ambiguous!
I hope that what I said in class clarified things: The value of the
sum S(P,phi) (or in the Riemann-Stieltjes case, S(P,phi,theta))
does depend on the choice of s_r's, so a better notation would be
S(P,(s_r),phi), respectively S(P,(s_r),phi,theta). For the integral
to be defined, the inequality |S(P,...) - A| < epsilon must hold for
all P finer than Q_epsilon _and_ all choices of (s_r) satisfying
t_r-1 _< s_r _< t_r.
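To make the role of the s_r's concrete, here is a small sketch of my
own: for a continuous integrand, sums formed with different tag choices
(left endpoints versus right endpoints) both approach the same value as
the partition is refined, as the definition requires.

```python
# Riemann sums S(P,(s_r),phi) with two different choices of the tags s_r.

def riemann_sum(phi, a, b, n, tag):
    # tag(t0, t1) picks the point s_r in the subinterval [t0, t1]
    w = (b - a) / n
    return sum(phi(tag(a + r * w, a + (r + 1) * w)) * w for r in range(n))

phi = lambda t: t * t
left = riemann_sum(phi, 0.0, 1.0, 100000, lambda t0, t1: t0)
right = riemann_sum(phi, 0.0, 1.0, 100000, lambda t0, t1: t1)

# both tag choices approach the integral of t^2 over [0,1], namely 1/3
assert abs(left - 1 / 3) < 1e-4 and abs(right - 1 / 3) < 1e-4
```
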
----------------------------------------------------------------------
You ask, regarding the description of the length L(gamma) on p.101
as a supremum of lengths of approximating polygons, whether it can
be defined as "the limit of the series as n-->infinity".
The question is -- what series (or sequence)? For each n, there are
infinitely many ways of partitioning gamma by n points, so we
can't just say "the limit of the lengths of the polygonal paths as the
number n of points in the partition goes to infinity", because for
each n, there are lots of different lengths. And, in fact, if we
made "bad" choices of how to partition our path, the lengths would not
approach L(gamma).
However, one can set up a concept of a "limit over all partitions",
not based on the number n of steps, but on the relation of
refinement. If we have a function X from the set of partitions P
to real (or complex) numbers X(P), we can say that a certain number
A is the limit of the numbers X(P) if for every positive real
number epsilon there exists a partition Q_epsilon such that
all partitions P that are refinements of Q_epsilon satisfy
|X(P) - A| < epsilon. This is very close to what the book does, but
to avoid confusing the student with a new version of the concept of
"limit", they don't give it that name; and also, because the L(pi)
are less than or equal to L(gamma), they give their definition using
a supremum instead of a limit.
----------------------------------------------------------------------
You ask why we use lengths of approximating polygons in the definition
of the length of a path (p.103).
It's part of the general approach of defining measurements of
complicated shapes in terms of measurements of straightforward shapes
that we have a formula for. To find the area under a curve, we
approximate it by unions of rectangles, whose areas we can compute as
"base times height", giving the definition of the Riemann integral.
To find the length of a path, we approximate it with polygons, i.e.,
unions of line-segments, whose lengths we can compute by the
Pythagorean Theorem.
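Here is a numerical sketch of my own: approximating the unit semicircle
gamma(t) = e^{it}, t in [0, pi], by inscribed polygons. Refining the
partition only increases the polygon length, and the lengths approach
L(gamma) = pi.

```python
# Lengths of inscribed polygons approximating the unit semicircle.
import cmath, math

def polygon_length(n):
    pts = [cmath.exp(1j * math.pi * k / n) for k in range(n + 1)]
    return sum(abs(pts[k + 1] - pts[k]) for k in range(n))

lengths = [polygon_length(n) for n in (4, 16, 64, 256)]

assert lengths == sorted(lengths)        # refinement only increases length
assert abs(lengths[-1] - math.pi) < 1e-3 # and the lengths approach pi
```
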
----------------------------------------------------------------------
You ask how they get the inequalities (5) on p.105 using (1).
Good question. I think that "(1)" should be "(3)". Of course, when
they write "Putting this with (1)", "this" means "(4)"; so they get (5)
by combining (3) and (4).
----------------------------------------------------------------------
You write, regarding the statement near the bottom of p.105 that a
path will be called "smooth" if it is continuously differentiable,
that you thought "smooth" meant "infinitely differentiable".
There are various kinds of "smoothness". What definition an author
will make term depends on the condition he or she wants to refer to.
I just did a Google search for "smooth function". The first hit
says "In mathematics, a smooth function is one that is infinitely
differentiable" -- the condition you had heard. The second says
that "smooth function" is a synonym for "differentiable mapping" or
"continuously differentiable mapping" -- the definition our authors
use. The third says "A smooth function is a function that has
continuous derivatives up to some desired order". That is the general
definition! The "desired order" depends on the situation. Of course,
any author needs to state what he or she will mean by the term.
----------------------------------------------------------------------
You ask about the reason for the definition of the "opposite" of
a path, on p.108.
We want to express the idea of "going along the same path but
in the opposite direction". Personally, I would express this
by letting -gamma be defined by -gamma (t) = gamma(-t), which
would make -gamma have domain [-b,-a]. But the authors choose
to make it have the same domain as gamma does; so they have to
make it take the value gamma(b) at t=a, the value gamma(a) at
t=b, etc.; so they define it to be gamma(a+b-t).
----------------------------------------------------------------------
You write that according to Theorem 6.7 (p.109) "the integral of a
continuous function along a closed path is 0".
No, the Theorem does not say that is true for a general continuous
function f -- only for one that can be written as f = F' !
As I emphasized in class today, this property is very special.
It excludes most "random" choices of f; though as proved on the
next page, it holds for all functions expressible by power series.
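To see the contrast concretely, here is a sketch of my own (the
functions are my choices, not the book's): integrating around the unit
circle, f(z) = 2z is the derivative of F(z) = z^2 and its integral over
the closed path is 0; f(z) = conj(z) cannot be written as an F', and
its integral is not 0.

```python
# Numerical contour integrals over the unit circle gamma(t) = e^{it}.
import cmath, math

def contour_integral(f, n=20000):
    # approximate the integral of f(gamma(t)) gamma'(t) dt, gamma'(t) = i e^{it}
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        z = cmath.exp(1j * k * dt)
        total += f(z) * 1j * z * dt
    return total

assert abs(contour_integral(lambda z: 2 * z)) < 1e-3               # f = (z^2)'
assert abs(contour_integral(lambda z: z.conjugate()) - 2j * math.pi) < 1e-3
```
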
----------------------------------------------------------------------
You ask about the phrase "where W(t)=F(gamma(t))" after the display
just above the middle of p.109.
That phrase defines the function "W(t)" referred to on the right-hand
side of the preceding equation. In other words, the authors first
define w(t), then observe that it is the derivative of a certain
function W(t). The final step in the proof is to apply the preceding
results, about a general function W(t) and its derivative w(t), to
the case of these functions.
I recommend defining symbols before one uses them; but mathematicians
often do what the authors do on the line you pointed to: first state
a fact, then make clear what they meant.
----------------------------------------------------------------------
You ask, regarding Theorem 6.9, p.109, "Doesn't this mean that the
value of the integral of f along gamma is not dependent on a choice
of gamma, except the endpoints?"
Exactly!
But _not_ for all functions f! Only for a function f which can
be written as the derivative of another function F.
----------------------------------------------------------------------
You ask, regarding Example 1 on p.109, why the result doesn't depend
on the path.
Because we have just proved the Theorem on that page, saying that if
f is the derivative of another function F, then the integral of f
over any path is given by a formula which only depends on F and the
endpoints of that path! Up to this point, we haven't known that fact,
so we had to compute integrals by specific computations based on the
path. If we have a function f that is not known to be the derivative
of another function F, we still have to do so. But precisely because
we have this theorem, we can forget about the path if we know a function
F whose derivative is f.
----------------------------------------------------------------------
You ask about an interpretation of the Estimation Lemma, p.111.
As I discussed in class last week, integration expresses the
"continuous accumulation" of a quantity. This lemma expresses
the simple fact that if you know how long a quantity has been
accumulating (the length of the path), and some bound on its
rate of accumulation (i.e., of the function f), you can get
an upper bound on how much it can possibly have accumulated.
In particular, as I discussed in class today, the key to
proving the hard implication (iii)=>(i) of Theorem 6.11 was the
observation that if we accumulate over a short path (the path
lambda of length |h|) at a small rate (given by f(z) - f(z_1)),
then the amount we accumulate is doubly small, and so when divided
by h remains small, allowing us to make the estimate needed for
that proof (last display on p.115).
----------------------------------------------------------------------
You ask whether the Estimation Lemma (p.111) means that the length of
the contour used affects the magnitude of the integral.
It affects what we _know_ about the integral. For some functions,
the integral will vary with the contour, but for others, which we are
particularly interested in, it will only depend on the endpoints, by
Theorem 6.7. If we have a function of the latter sort, then by using
different contours with the same endpoints, we get different true
information about the same quantity. So in a given situation, we want
to choose the contour to make the information we get as useful as
possible; e.g., to make ML as small as we can.
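Here is a numerical sketch of my own of the bound |integral| _< ML: on
the unit circle, L = 2 pi, and for my example f(z) = 1/(2 + z) we may
take M = 1, since |2 + z| >= 1 when |z| = 1.

```python
# Check the Estimation Lemma bound |integral of f over gamma| <= M L
# for f(z) = 1/(2+z) on the unit circle.
import cmath, math

n = 100000
dt = 2 * math.pi / n
integral = sum((1 / (2 + cmath.exp(1j * k * dt))) * 1j * cmath.exp(1j * k * dt) * dt
               for k in range(n))

M, L = 1.0, 2 * math.pi            # |f| <= 1 on the circle; length 2 pi
assert abs(integral) <= M * L
```
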
----------------------------------------------------------------------
You ask about the equality on the third line of p.113.
This is an instance of the equality (11) on p.15: the absolute value
of a product is the product of the absolute values of the factors.
(The square root of X^2 + Y^2 is the absolute value of X + iY.
They could also have written the absolute value of the second factor
as a square root of a sum of squares, but that would not have helped
in this proof.)
Thanks for pointing out the error in signs at the bottom of p.112!
That error doesn't affect the proof: One should simply change the
"-" to "+" in that line, and the three places where it occurs near
the top of p.113.
Actually, I see that this whole computation can be simplified.
Starting with the equation on the 6th from last line on p.112,
X^2 + Y^2 = (X-iY)(X+iY) = \integral_a ^b (X-iY)(u(t)+iv(t)) dt
one can observe that since the left-hand side above is real, it must
equal the real part of the right-hand side; then note that the real
part of a complex number is _< its absolute value; so the above
expression
= \integral_a ^b re((X-iY)(u(t)+iv(t))) dt
_< \integral_a ^b |(X-iY)(u(t)+iv(t))| dt
= \integral_a ^b |X-iY| |u(t)+iv(t)| dt
= |X-iY| \integral_a ^b |u(t)+iv(t)| dt
= sqrt(X^2+Y^2) \integral_a ^b |u(t)+iv(t)| dt.
This brings us to the display above (13) on p.113, and the proof
can then be finished as the authors do.
This is exactly like their computation, except that they wrote out
what the product (X-iY)(u(t)+iv(t)) was, and then used the fact that
the absolute value of such a product is the product of the absolute
values of the factors, while as shown above, we can simply use that
fact without expanding the product!
----------------------------------------------------------------------
You ask how, on p.113, display (13) is deduced from the preceding
display.
If X^2 + Y^2 is nonzero, by dividing by the positive real number
sqrt(X^2 + Y^2). If, rather, X^2 + Y^2 = 0, then (13) clearly
holds.
----------------------------------------------------------------------
You ask how, on p.113, 8th line from bottom, one can speak of the
maximum value of |f| on gamma tending to zero.
This makes sense because of the preceding sentence, which says
"suppose that gamma is a fixed contour, and f varies." So
we are considering not one function f, but a family of such
functions. For each such function, we can find its maximum on
gamma, and as f varies in some way, the values we get for
that maximum can go to zero. In a particular situation we might
have one function f_r for each real number r, and then
the statement might be that as r -> pi, the number
max_gamma |f_r(z)| approaches 0 as a function of r.
----------------------------------------------------------------------
You write:
> P.114, Thrm 6.11: I would like to know a little bit more about
> the D that we are taking. Can it be any D? Does f have to be
> differentiable on all of D? Are discontinuities of any kind
> allowed? What restricts D?
It's right for you to ask these questions -- but after asking them, you
should have set about answering them. Did you check what is assumed
about D in the theorem? Do you know, solidly, the definition of the
word used in that assumption? Did you check what is assumed about f?
Check these points out, and then if you still need to be sure, ask
specific questions like "It says that f is ... . Does that mean
... or ... ?" Mathematicians are occasionally sloppy, but the basic
assumption on reading a mathematical statement should be that it means
exactly what it says, and that the mathematical terms in the statement
mean exactly what they are defined to mean. So you should first read
a result under those assumptions, and then, if you feel you have reason
to suspect that something means something different from what is said
(for instance, if it seems that the assumption does not imply the
conclusion without additional conditions; or if something seems to be
used in the proof that you didn't see in the statement), then make a
guess that maybe something is misstated.
Note in particular that the theorem nowhere says that f is
differentiable. Therefore the answer to "Does f have to be
differentiable on all of D?" is that there is no assumption that
it is differentiable _anywhere_ on D. (It happens that we will show,
much later in the course, that an f satisfying these conditions will
be differentiable everywhere on D. But that is not something we know
now, and so has nothing to do with understanding this theorem and its
proof.)
----------------------------------------------------------------------
You ask what the usefulness of Theorem 6.11(ii), p.114, is.
First, note that (ii) does not say that _every_ continuous
function has integral 0 around any closed path. The Theorem says
that conditions (i)-(iii) are equivalent; so it says that a function
satisfies (ii) if and only if it satisfies (i) and (iii).
So, for example in your homework for this week, you will be computing
integrals of several functions around a closed path. You will find
that not all of those integrals come to 0; hence you will know from
this that one or more of the functions you are given do not satisfy
conditions (i) and (iii) of that theorem, either.
----------------------------------------------------------------------
You ask about the third display on p.115, where F(z_1 + h) is
expressed as the sum of two integrals.
Well, F(z_1 + h) is defined as the integral of f over a contour
from z_0 to z_1 + h. Although they have defined it to be the
integral over a particular contour (line 5 on this page, "we choose
a contour..."), assumption (iii) tells us that we get the same value
for any contour from z_0 to z_1 + h. In particular, if we let
gamma be the chosen contour from z_0 to z_1, and let lambda
be the line-segment from z_1 to z_1+h, then gamma + lambda
is a contour from z_0 to z_1 + h, and using that contour, we
get the value for F(z_1 + h) shown in the display.
----------------------------------------------------------------------
You ask how we get the display on p.115 between "so" and "We
therefore have".
In this integral, f(z_1) is a constant. (It is an integral
with respect to the variable z.) So we are integrating the
constant f(z_1)/h over the path lambda from z_1 to z_1 +h.
Applying the formula in the preceding display, we get the answer
(f(z_1)/h)((z_1 + h) - z_1) = (f(z_1)/h)h = f(z_1).
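To see concretely that integrating a constant over a segment gives the
constant times the endpoint difference, here is a small numerical check;
the particular values of z_1, h, and the constant k are invented for
illustration:

```python
import numpy as np

z1, h = 2 + 1j, 0.5 + 0.3j    # hypothetical sample values
k = 4 - 2j                    # the constant playing the role of f(z_1)/h

# lambda(s) = z_1 + s h for s in [0,1], so lambda'(s) = h, and
# \integral_lambda k dz = \integral_0^1 k h ds
s = np.linspace(0.0, 1.0, 100001)
integral = np.sum((k * h) * np.diff(s))

# matches k * ((z_1 + h) - z_1) = k h
assert abs(integral - k * h) < 1e-9
```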
----------------------------------------------------------------------
You ask whether "there any cases where the anti-derivative of a complex
function is different from the anti-derivative of the corresponding
real function".
Well, a complex function isn't the same thing as a real function.
For instance, although the general antiderivative of the real function
x^2 is x^3/3 + C, and the general antiderivative of the complex
function z^2 is z^3/3 + C, these are not literally the same
functions, since one has real domain and codomain, while the other
has complex domain and codomain.
But I guess that by "the same function" you mean "the function having
the same name or description".
Well, generally speaking, we name complex functions by analogy with
real functions having the same properties, and which the given complex
function extends; so when we can do this satisfactorily, equations
holding among real functions also hold among the complex functions
with the same names. But we can't always make this work out. For
instance, the real absolute-value function x |-> |x| has an
antiderivative, x |-> x|x|/2; but the complex absolute-value function
will turn out not to have an antiderivative. The real function
x |-> 1/x has one antiderivative, log(x), for x > 0, and another,
log(-x), for x < 0; but we will see in the next reading that
defining "nice" analogs of the "log" function on the complex plane
is a complicated business.
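One can even see numerically that the complex absolute-value function
fails to have an antiderivative. Here is a quick check; the circle
|z - 2| = 1 is my own choice of test path, picked so that |z| is
nonconstant on it:

```python
import numpy as np

# Integrate |z| around the circle |z - 2| = 1.  A function with an
# antiderivative would integrate to 0 around any closed contour.
t = np.linspace(0.0, 2 * np.pi, 200001)
z = 2 + np.exp(1j * t)
dz_dt = 1j * np.exp(1j * t)
integral = np.sum((np.abs(z) * dz_dt)[:-1] * np.diff(t))

# The integral is visibly nonzero, so |z| has no antiderivative on
# any domain containing this circle.
assert abs(integral) > 1
```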
----------------------------------------------------------------------
You ask what it means on p.120 when they say that the logarithm is
"multivalued".
It means that for a given value of z, there is more than
one value for "the logarithm of z".
The problem is that by definition, a function isn't allowed to
have that property. (Some mathematicians have defined "multivalued
functions", which aren't functions in the ordinary sense; but
they're messy to handle. In earlier times, before mathematicians
settled on what "function" should mean, people would speak of
the complex logarithm as a "multivalued function"; this is why the
authors say "In classical terms ...".)
So the best way to say things is that the idea underlying the
logarithm is multivalued, so that in converting that idea into
(single-valued) functional notation, we end up, not with one
function, but with many functions.
----------------------------------------------------------------------
You are also right that on p.121, first line of last paragraph,
"from 1" should be "from r".
----------------------------------------------------------------------
You ask, regarding the definition of arg z on p.122, why the value
z = 0 is excluded.
For any _nonzero_ z, the choices of value for arg z are those
real numbers theta such that e^(i theta) = z/|z|. These are
"unique up to multiples of 2 pi"; i.e., once we know one of them,
the others are gotten from it by adding multiples of 2 pi to it.
As discussed in the text and in class, this allows us to choose
particular values from this set so that the argument function is
nearly continuous -- it has to jump by 2 pi along a certain line,
but will be continuous everywhere else.
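(If you have Python available, numpy's angle function computes exactly
this sort of principal value, with values in (-pi, pi], and one can
watch the 2 pi jump across the negative real axis:

```python
import numpy as np

# np.angle returns the principal argument, in (-pi, pi].
# Just above and just below the negative real axis the values
# differ by nearly 2*pi -- the "jump along a certain line".
above = np.angle(-1 + 0.001j)   # close to  pi
below = np.angle(-1 - 0.001j)   # close to -pi
assert abs((above - below) - 2 * np.pi) < 0.01

# Elsewhere, e.g. near the positive real axis, there is no jump.
assert abs(np.angle(1 + 0.001j) - np.angle(1 - 0.001j)) < 0.01
```

Away from that one line, the function is continuous.)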
On the other hand, if z = 0, while it is true that we can write
z = r e^(i theta) by taking r = 0, any value of theta whatsoever
will work; so there is no longer any decent way of choosing one value
among the rest; in particular, there is no way of defining it that
will be continuous at 0. Moreover, whatever value we chose for
arg 0 would give no information about that point. So to avoid making
the study of the argument even messier than it otherwise is without
anything gained in doing so, we simply leave it undefined for z = 0.
----------------------------------------------------------------------
You ask whether in this book, theta is always taken such that
-pi < theta _< pi, as in the middle of p.122.
No. When the theta in question is arg(z), the principal value
of the argument, it is in that range by definition. But
in the last paragraph of p.124 they define other branches of the
"arg" function, and these give values in different ranges. On
pp.127-129 they use values of theta in many different ranges.
----------------------------------------------------------------------
Regarding the definition of arg_alpha near the bottom of p.124, you
ask whether this is simply the argument theta of the given complex
number chosen from the interval alpha - 2 pi < theta < alpha.
Right. Although I see that in the next-to-last display, where
they have "z\in C", they ought to have "z\in C_alpha". (They've
just gone to the work of defining C_alpha!) More generally, one
could define arg_alpha on all of C \ {0}, but then the final
inequality in the last display on that page would be "_<". (Under
this more general definition, the function would be discontinuous on
the line they call R_alpha.)
----------------------------------------------------------------------
You ask how the continuity of exp is used just above the middle
of p.126.
They should have said "continuity of Log". I hope my explanation
in class showed how that was used.
----------------------------------------------------------------------
You ask about the discussion in the last full paragraph on p.127.
With respect to "getting the picture", it would be best to ask in
office hours, where I can see what sort of picture you have. By
e-mail the best I can do is start the discussion: Suppose we decide
that the argument we will use at each point is the one having
values in [0, 2 pi). Let's call it Arg(z). Do you see that this
function is discontinuous all along the positive real axis? Do you
see that, as a result, if we have a path gamma that crosses the
real axis, then when gamma does so, the function Arg(gamma(t)) will
be discontinuous? That is what they are saying.
What they go on to show on the next page is that if we don't tie
ourselves to one particular rule for which value of the argument to
choose, but allow this to vary depending on the part of the path
that we are on, then we can choose arguments for each of the complex
numbers gamma(t) so as to get a continuous function of t.
----------------------------------------------------------------------
You ask a large number of questions about Theorem 7.1, p.128.
Many of the answers lie in material on earlier pages.
You ask what is meant by a "continuous choice of argument". This
is not something whose meaning you are supposed to be able to
guess from the words in the phrase! Rather, the authors say
what they will mean by the phrase on p.127, second paragraph. It
is a concept that takes a good bit of digesting; the remainder
of p.127 illustrates the difficulty in making such a continuous
choice (and hopefully gives one an inkling of how it can be done).
Only after thinking about that concept is one ready to read the
theorem. If you have difficulties understanding that definition
on p.127, ask questions about that!
You also ask about the symbol C_alpha_r. I anticipated that
people might forget what that meant, between the place where it
is defined and the later places where it is used, so I told the
class at the end of Monday's lecture to notice that the symbol
C_alpha is defined on p.124.
I don't say that, even after absorbing the earlier definitions,
the material of this course will be easy! But reading and
absorbing what has come before is a necessary step to understanding
what comes later.
I know that some people find it harder to learn by studying a
book than by listening to a conventional lecture. If that is
your situation, I suggest that you do things to make studying
from the book more like attending a lecture: on the one hand,
try to hear in your mind the "voice" of the authors saying the
words you are reading. On the other hand, copy into a notebook
lots of notes from the book as you read it; writing them may help
fix them in your mind.
Let me know if these suggestions help!
----------------------------------------------------------------------
Your comment about "taking the arguments of a real number" made me
see that where they write arg_alpha_1(t_1) etc. on pp.128-129, this
should be arg_alpha_1(gamma(t_1)) etc. Thanks for pointing it out!
I hope it makes more sense after you re-read it with this correction,
noting that the gamma(t_r)'s are not real numbers but points of
the path.
----------------------------------------------------------------------
You ask what the winding number, computed as an integral on
pp.130-131, means graphically.
The authors talk about this at the beginning of section 7.4 (p.128):
They observe that as one follows a closed path gamma that doesn't
go through the origin from its start to its finish, the total change
in the argument of gamma(t) will equal 2 pi times the number of
times gamma winds around the origin.
So it follows that when they compute this change in argument, divide
it by 2 pi, and call the result the winding number, this represents
the number of times gamma winds around the origin. The winding
number of a path that isn't closed is not as striking a concept; one
can simply say that it records the total cumulative angle-change the
path makes relative to the origin, measured as some fractional number
of full circles.
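This "total change in argument" description also gives a practical way
to compute winding numbers. Here is a numerical sketch; the path
e^(2it), which circles the origin twice, is my own example:

```python
import numpy as np

# A closed path that winds around the origin twice:
t = np.linspace(0.0, 2 * np.pi, 100001)
gamma = np.exp(2j * t)

# Accumulate the small changes of argument between consecutive points;
# each np.angle(ratio) is a tiny, unambiguous change, so summing them
# gives the total change in a continuous choice of argument.
dtheta = np.angle(gamma[1:] / gamma[:-1])
w = dtheta.sum() / (2 * np.pi)

assert round(w) == 2            # gamma winds around 0 twice
```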
----------------------------------------------------------------------
You ask about the statement at the end of the top paragraph on p.131
that "the imaginary parts add up to 2 pi w(gamma,0)".
The imaginary parts are the numbers
arg_alpha_r(gamma(t_r)) - arg_alpha_(r-1)(gamma(t_(r-1))).
For each r, this is 2 pi w(gamma_r,0), since arg_alpha_r(gamma(t))
is a continuous choice of argument on that piece of the path. Hence
by Theorem 7.2 their sum is 2 pi times the winding number of gamma.
----------------------------------------------------------------------
You ask why, in the proof of the formula for the winding number of
a not-necessarily-closed path (p.131), we can "remove" the real parts.
The argument shows that the imaginary part of the integral is equal to
2 pi times the winding number of gamma. So if we drop ("remove") the
real part, we are left with a value that determines that winding number.
It is not this case (of a not-necessarily-closed path) that is
peculiar, but the case of a closed path. In that case, the argument
they give shows that the real part is zero, so we don't have to
say "take the imaginary part" and then divide by 2 pi; all we
have to do is divide the whole expression by 2 pi i.
----------------------------------------------------------------------
You ask why, in the argument at the top of p.131, the real and
imaginary parts of the integral don't both give 0.
I'm not sure why you think they should. If you were saying "we have
proved that an integral around a closed path is 0", then I hope you
realize after what I said in class that this is definitely not the
case. What we have proved is that _if_ a function f is the derivative
of another function F, _then_ the integral of f around a closed
path is 0. We saw many examples of functions that were not
derivatives of other functions (e.g., x^2 + y^2), and others that
were. The function 1/z is a very interesting borderline case.
In any sliced plane C_alpha it is the derivative of a function,
namely log_alpha, but in the whole region where it is defined,
C \ {0}, it is not. As we integrate 1/z around the path gamma,
the integral is, up to a constant of integration, some "log z"
function; but when we get back to the starting point, we find it may
be a different "log z" function from the one we started with,
differing from it by a multiple of 2 pi i; so the value of the
integral around gamma is that number 2 pi i n. Since the real parts
of different logarithm functions are the same, the real part of this
integral is 0; since different logarithms differ in their imaginary
parts (by multiples of 2 pi), the imaginary part of the integral need
not be 0.
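One can check this numerically. Here is a sketch for the standard case,
integrating 1/z once counterclockwise around the unit circle (my choice
of path, with n = 1):

```python
import numpy as np

# Integrate 1/z around the unit circle z = e^(it), t in [0, 2*pi].
t = np.linspace(0.0, 2 * np.pi, 200001)
z = np.exp(1j * t)
dz_dt = 1j * np.exp(1j * t)
integral = np.sum((dz_dt / z)[:-1] * np.diff(t))

# Real part 0; imaginary part 2*pi, i.e. the value is 2*pi*i*n with n = 1.
assert abs(integral.real) < 1e-6
assert abs(integral.imag - 2 * np.pi) < 1e-3
```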
----------------------------------------------------------------------
You ask why on p.132 they look at the complement of the path gamma.
That is because only for points z_0 in the complement of gamma
is w(gamma, z_0) meaningful. We can't calculate "how many times
gamma goes around z_0" (i.e., the winding number) if it doesn't
go around z_0, but goes right through it at some point.
----------------------------------------------------------------------
In your answer to your pro forma question concerning p.132, you end
with "The only way an integer-valued function can be continuous is
if it is a constant function".
I hope you realize that this is true only for a function on a connected
set, but that an integer-valued function on a non-connected set does
not have to be constant.
----------------------------------------------------------------------
You ask about the word "telescoped" on p.135.
Have you looked it up in a dictionary? It means "collapsed together,
shortened". Old-fashioned hand telescopes were made so that the
parts could be collapsed into each other, for easy carrying.
----------------------------------------------------------------------
You ask about what the authors call "the real crunch" in the rigorous
calculation of winding numbers (p.135, 4th-from-last line).
Whether the verification is hard depends on the path, of course. An
example that takes some thought is the winding number around zero of
the path gamma(t) = cos 2t + i sin 3t for t\in[0,2 pi] (a relatively
simple Lissajous curve). More difficult would be paths in which
x and y are polynomials in t, since determining where polynomials
go to zero can be hard. But in the cases one generally works with,
giving the full details is not so much hard, as tedious.
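If one only wants the answer rather than a rigorous verification, the
winding number of such a path can be found numerically by summing small
argument changes; this is a sketch for the Lissajous example above:

```python
import numpy as np

# Track the argument along gamma(t) = cos 2t + i sin 3t, t in [0, 2*pi].
t = np.linspace(0.0, 2 * np.pi, 400001)
gamma = np.cos(2 * t) + 1j * np.sin(3 * t)

# The path never passes through 0 (cos 2t and sin 3t have no common
# zero), so each step's argument change is small and unambiguous.
dtheta = np.angle(gamma[1:] / gamma[:-1])
w = dtheta.sum() / (2 * np.pi)

# For a closed path avoiding 0, the result must be an integer;
# round(w) is then the winding number.
assert abs(w - round(w)) < 1e-3
```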
----------------------------------------------------------------------
Concerning the second diagram on p.136, you say that when you trace it
from D to E with your finger, "it seems to travel counter-clockwise
for half the time and clockwise for the other half". I would guess that
you are referring to whether the path is bending to the right or to
the left. But the concept of winding number does not concern that; it
refers to the angular motion with respect to a fixed point, in this
case the origin. If you follow the path with your finger, thinking
about the direction of the vector from the origin to the tip of your
finger, you will see that this direction has an overall change of pi
in a clockwise direction as the path goes from D to E (though for
a couple of brief moments it does go the opposite way).
----------------------------------------------------------------------
You ask how Cauchy's Theorem (p.141 et seq.) relates to Green's Theorem.
Interesting question, but the answer is somewhat complicated.
Green's Theorem says that the line integral of a vector field (P,Q)
around the boundary of a region of the plane is equal to the double
integral of the function Q_x - P_y over the region. Now given
a complex function f(z) = u(x,y) + i v(x,y), if you do the
computations you will find that the real part of the complex integral
of f(z) over a path gamma is equal to the line integral of the
vector field (u(x,y),-v(x,y)), while the imaginary part is equal
to the line integral of (v(x,y),u(x,y)). If we form the functions
Q_x - P_y corresponding to these two vector fields, we see that both
of them will be zero if and only if f satisfies two Cauchy-Riemann
equations. Hence by Green's Theorem, if the real and imaginary
parts of f are continuously differentiable and f satisfies the
Cauchy-Riemann equations, its integral over the boundary of any region
of the complex plane will be zero.
This is one form of Cauchy's Theorem, given on p.162. We will see
some slightly different formulations before we get there, but as the
authors will emphasize, they are all ultimately equivalent.
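As a numerical sanity check on the Cauchy-Riemann computation above:
a function satisfying the equations integrates to 0 around a closed
contour, while one that doesn't need not. The unit circle and the test
functions z^2 and conj(z) are my own choices:

```python
import numpy as np

# Unit circle as the closed test contour.
t = np.linspace(0.0, 2 * np.pi, 200001)
z = np.exp(1j * t)
dz_dt = 1j * np.exp(1j * t)

def contour_integral(values):
    # values = f(z) sampled along the path
    return np.sum((values * dz_dt)[:-1] * np.diff(t))

# z^2 satisfies the Cauchy-Riemann equations: integral is 0.
assert abs(contour_integral(z**2)) < 1e-6

# conj(z) does not satisfy them; its integral is 2*pi*i, not 0.
assert abs(contour_integral(np.conj(z)) - 2j * np.pi) < 1e-3
```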
----------------------------------------------------------------------
You ask why in Theorem 8.1 (p.143) they prove Cauchy's Theorem
for triangles, rather than for shapes such as squares or rectangles.
Because from the result for a triangle they can prove the
result for all star domains -- and these include not only squares
and rectangles (for which they could have proved the result just
as easily as for triangles), but also infinitely many shapes
(such as the "blob" of Figure 8.6, p.146) for which it would
have been very hard to prove it directly. You will see, if
you try to replace the triangles in the proof of Theorem 8.2
by squares or rectangles, that these don't make the proof
go through; so triangles are really the best way to start.
----------------------------------------------------------------------
You ask why, in the last display on p.144, they can say that the
length of the boundary of T_1 is half the length of the boundary of T.
The triangles T^(r) in Fig.8.5 are formed by joining the midpoints
of the sides of T, from which one can deduce that they are
all similar to T, but scaled down by a factor of 1/2; hence their
perimeters are each half the perimeter of T.
----------------------------------------------------------------------
You write:
> p. 145, near the middle: how does the estimation lemma give us (2)?
> It seems more of an application of the mean value theorem.
The Mean Value Theorem is not true for complex functions, so one
can't use that. The Estimation Lemma serves as a substitute in
some cases.
To apply the Estimation Lemma, one uses the upper bound on the
integrand that they have found in the preceding display, except
that in place of "|z - z_0|" one uses the estimate L(boundary T_n)
noted in the preceding line.
----------------------------------------------------------------------
You ask why, on p.145, knowing that a + b z has an antiderivative
makes its integral over gamma equal to 0.
By Theorem 6.11 (p.114), equivalence of (i) and (ii).
That theorem is a very fundamental result for this course; I hope you
will learn it thoroughly, so that any time we see that some function
satisfies one of conditions (i)-(iii), you will immediately be aware
that the other two conditions also hold.
----------------------------------------------------------------------
You ask about the deduction that c = 0, near the bottom of p.145.
The preceding equation says that for arbitrary epsilon > 0, we have
c _< epsilon L(boundary T)^2. This makes c _< the infimum of these
numbers, which is 0. And a number which is >_ 0 and also _< 0
must be 0.
The wording of your question suggests that you thought the authors
were deducing this conclusion only from the facts "epsilon is arbitrary
and c >_ 0". However, when they have gone through an argument and
gotten a result, you should always expect that that result will be used
(unless it is one of the conclusions of the theorem), and so bear it
in mind in reading what comes next. In this case, there is even a
hint in the wording: The sentence you ask about begins with "But",
which means that what is to follow will somehow be played off against
what precedes to get a conclusion.
----------------------------------------------------------------------
You ask about the phrase "If it were possible for f to be exactly
linear" on p.146, line 2.
What they mean is "If it were possible to find a neighborhood of each
point in which f was exactly linear". The word "locally" at the end
of the phrase is supposed to signal this meaning, but doesn't quite
get it across!
----------------------------------------------------------------------
You ask what would constitute a "formal argument", such as is referred
to in the last paragraph of section 8.1, p.146.
All the proofs we see in this course are (more or less) formal
arguments. What the authors are referring to that is not a formal
argument is the sketch in the preceding paragraph on that page, which
asserts that by estimating the conflicting effects of increasing numbers
of triangles and decreasing failure of linearity on small triangles,
one could give a different sort of proof of Theorem 8.1.
Actually, I've thought about how one might do that, and I don't
see a way, because, as I mentioned in class, we don't know that
for every epsilon there exists one delta that will make the second
display of p.145 hold for all z and z_0. (The choice of delta
in the proof given is only known to work for one specific z_0.)
I don't know whether the authors had a method in mind, or wrote this
comment without thinking things through.
----------------------------------------------------------------------
As you observe, a star-domain (p.146) can have more than one
star-center.
One can think of the term as meaning "a point which one may take as
a `center' from which to send out `rays of light' that reach all
points of the domain." Looked at this way, "center" does not imply
uniqueness.
----------------------------------------------------------------------
Your pro forma question, concerning p.146, is worded "Is the star
domain true for all z ?"
You need to be careful to give more precise mathematical statements!
Though I won't deduct credit for sloppy wording in your questions of
the day, if you used wording like that on an exam or the homework, you
would certainly lose points! What you meant could presumably be
expressed "In a star domain, must every point z satisfy the condition
stated in the book for z_*?" (Or: "Is every point of a star domain
a star center?")
----------------------------------------------------------------------
You ask about the transition from the first display on p.147 to
the second, saying that the authors seem to be assuming F is
an antiderivative of f, and turning two of the integrals into
"F(z_1) - F(z_*)" and "F(z_1+h) - F(z_*)" (then cancelling the F(z_*)
terms), though we haven't yet proved that F is an antiderivative
of f.
No -- the authors are simply using the _definition_ of F made
in the statement of the theorem. Namely, F(z) is defined to be
\integral_[z_*,z] f (the integral of f along the straight-line
path [z_*,z]). Since the first and last integrals in the first
display on p.147 are along straight-line paths from z_*, this
definition can be applied. But we are not assuming F to be an
antiderivative of f, so we cannot do that sort of substitution
for any other paths -- for instance, for the middle integral in
that equation.
I see that the task of reading the statement of the theorem is subtle:
After "then F(z)", the symbol "= \integral_[z_*,z] f" and the phrase
"is an antiderivative of f" occur in immediate succession, yet the
first is to be read as a definition, the second as a conclusion of
the theorem. The reader must deduce that the first is a definition
from the fact that F(z) has not been defined before in this theorem,
hence nothing could be asserted about it. In my own writing I prefer
to use wording like "Let F(z) = \integral_[z_*,z] f. Then F is an
antiderivative of f". But many people prefer the sort of wording used
in the book, for its brevity.
----------------------------------------------------------------------
Yes, the reference to "Theorem 5.11" after the first few displays on
p.147 should say "Theorem 6.11"!
----------------------------------------------------------------------
You ask how they get (3) on p.147.
The expression (F(z_1 + h) - F(z_1))/h on the left-hand side
of (3) equals the integral of f(z)/h on the right-hand side,
while the f(z_1) on the left equals the integral of f(z_1)/h
on the right -- that step is the application of the previous
expression.
----------------------------------------------------------------------
You ask whether the phrase "independent of the choice of contour" in
the last line of p.147 refers only to contours within D.
Right!
They used the phrase "in D" in referring to closed contours on the
preceding line; they should have repeated it on this line when
referring to general contours.
----------------------------------------------------------------------
You ask about the choice of 1 as star-center of C_pi on p.148.
As you note, every point of the positive real axis is a star-center.
Each such choice of star-center would give an antiderivative of 1/z.
These antiderivatives all have the form (Log z) + c. The authors'
choice of the point 1 gives the exact function Log z that they
want, instead of any of the others. So it was a little bit of
behind-the-scenes prearrangement to make the result come out nicely.
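One can check numerically that the star-center 1 does give the principal
Log. Here is a sketch; the sample point z = 1 + i is my own choice of a
point in the cut plane:

```python
import numpy as np

# F(z) = integral of 1/zeta along the straight segment [1, z];
# with star-center 1 this should reproduce the principal Log z.
z = 1 + 1j                          # hypothetical sample point in C_pi
s = np.linspace(0.0, 1.0, 200001)
path = 1 + s * (z - 1)              # the segment [1, z]
# f(path(s)) * path'(s) = (z - 1) / path(s)
F = np.sum(((z - 1) / path)[:-1] * np.diff(s))

assert abs(F - np.log(z)) < 1e-4    # np.log gives the principal value
```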
----------------------------------------------------------------------
You ask, first, how we know that the integral of Lemma 8.4 (p.149)
exists.
The existence of integrals of continuous complex-valued functions
on contours was developed in section 6.2.
----------------------------------------------------------------------
You ask why Lemma 8.4 on p.149 (and Cauchy's Theorem on p.153)
requires f to be differentiable.
The proof of Lemma 8.4 uses Corollary 8.3, which requires f to
be differentiable; and the proof of Cauchy's Theorem in turn uses
Lemma 8.4! (And as to why Corollary 8.3 requires differentiability,
look at what its proof uses. You should be able to trace the use
of differentiability back to where it is actually used directly in
a proof.) This whole chapter is based on developing properties of
differentiable functions, each successively stronger result building
on the results proved before it. None of these results is true
for nondifferentiable functions.
----------------------------------------------------------------------
You ask what the authors mean on p.151 when they speak of "taking
-nu_n copies of the opposite contour -boundary R_n".
This is what they say to do in case nu_n is negative. So, for
instance, suppose nu_n = -3. Then -nu_n = 3, so their rule becomes
"Take 3 copies of -boundary R_n". I.e., writing beta = boundary R_n,
this is the contour 3 (-beta) = (-beta) + (-beta) + (-beta).
Now by their convention on p.36, 3rd paragraph, "-beta" is a contour
that runs over beta backwards, and by their convention on p.34,
"adding up" contours means going through them one after another;
so (-beta) + (-beta) + (-beta) makes sense.
The point of this gimmick is to reduce to the case of positive
coefficients, so that they can use the definition that a positive
integer times a contour means the sum of that many "copies" (i.e.,
repetitions) of that contour.
However, as I indicated in class, all that matters is that
nu_n boundary R_n stand for a combination of edges with integer
coefficients, in which each edge occurs nu_n times as many times
as it occurs in boundary R_n. What the authors describe is
a convenient way of getting a contour with the right number of
repetitions of each edge. Once we have our combination of edges,
however we construct it, we can "integrate" over it by taking the
integrals over the separate edges, multiplying them by the integer
coefficients, and adding.
----------------------------------------------------------------------
You ask what they mean, on p.151, last sentence, about allowing
q to be negative.
Remember that if L is an edge traversed in one direction, then
-L denotes the same edge, traversed in the opposite direction.
So if, for instance, the edge L is traversed 3 times in the
given direction and 5 times in the opposite direction, they call
this "-2 L".
The point of this way of counting things is that all we are
interested in these edges for is to integrate over them! So
if we have a formula which represents an integral over sigma,
expressed as a sum of integrals over intervals, or a formula which
represents a sum of integrals over boundaries of rectangles,
likewise expressed as a sum of integrals over intervals, then
to count the contribution to this formula of the integrations
that go over L, we multiply the integral over L by the
number of times we integrate over it in its proper direction
minus the number of times we integrate over it in the opposite
direction. The result of that subtraction, which we call q, can
be positive or negative.
----------------------------------------------------------------------
You ask, regarding the first long display on p.152, why the
last two terms on the left-hand side are there.
That display represents the sum of the integrals of f over the
expression B given in the preceding display. Now B is defined
in terms of A; if you check back to what A is, it consists of
the rectangles boundary R_n with coefficients nu_n, minus the
contour sigma. When we integrate over the sum of the terms
"nu_n boundary R_n", we get the first k terms of that display.
The fact that we subtract sigma is represented by the next-to-last
term, while the final term corresponds to the final term
-q boundary R_s in the definition of B.
----------------------------------------------------------------------
In your question about the proof on p.152, you refer to "the
variable q". But it's not a variable, it's a specific number,
defined on the third-to-last line of p.151 to mean the number of
times the edge L occurs in the last display on that page; in
other words, the number of times it appears in the combination of
rectangles shown, minus the number of times it appears in sigma.
We want to show that that combination of rectangles is effectively
the same as sigma; namely, that the two objects contain every
edge the same number of times. So given an edge L, we let q
denote, as noted, the difference between the number of times
L occurs in the two expressions. In the computation on p.152 we
show that q = 0, i.e., that L really does occur the same number
of times in both places. Since we show this for all L, the
two objects are indeed effectively the same.
----------------------------------------------------------------------
You ask about the two middle displays on p.152, and which winding
numbers in them can be nonzero.
One thing that may not be obvious is that the list of k
summands at the beginning of each equation includes a summand
nu_s w(boundary R_s, ...). I.e., though there is an R_s term at
the end, that doesn't mean that such a term is omitted from that list.
(This can be seen from the fact that the list expresses the winding
number of "B", which contains "A", which as shown on the preceding
page involves all rectangles from R_1 to R_k.)
So now, if we look at the first equation, we see that the winding
number of R_n about z_s will be 0 except for n = s, where
it is 1; thus the s-th term of that equation will give nu_s.
The term -w(sigma, z_s) will give -nu_s (see first display on
p.151, which defines the terms nu_n), cancelling what precedes.
That leaves only the term -q w(boundary R_s, z_s) = -q . 1 = -q,
verifying the equation.
In the second equation, the r-th term cancels the sigma-term, while in
the last term, w(boundary R_s, z_r) (cf. correction noted on the
homework for today) gives 0; so the sum is 0.
----------------------------------------------------------------------
You ask why one has to assume gamma_1 and gamma_2 have the
same winding numbers about all points of the complement of D at
the start of section 8.6 (p.153).
So that, after joining gamma_1 and -gamma_2 by a cut, we will
have a contour whose winding number about all points of the complement
of D is 0. (So that Cauchy's Theorem will be applicable.)
----------------------------------------------------------------------
You ask whether Fig.8.13, p.154, doesn't show an application of
Theorem 8.9 in which the sum of the winding numbers of gamma_1
and gamma_2 around some points not in D is nonzero.
I hope what I said in class cleared this up: One applies that theorem,
not to gamma_1 and gamma_2, but to gamma_1 and -gamma_2 !
----------------------------------------------------------------------
You ask whether the statement that if f is a differentiable function
on a domain D, and gamma_1, gamma_2 are two closed contours
which have the same winding number about every point not in D, then
the integrals of f over gamma_1 and gamma_2 are equal is true
in all cases (p.154).
Yes. If you find that amazing, then I am happy that you are
discovering the miraculous quality of complex analysis!
But I ought to point out a related point of view from which it isn't
so surprising. If f has an antiderivative in D, then both those
integrals must equal zero, so of course they are equal. So the above
result says that a function that is differentiable has a behavior close
to that of a function with an antiderivative: its integrals around
closed paths may not be zero, but they depend only on how many times
the path winds around points outside the domain.
----------------------------------------------------------------------
You ask about the second picture in Fig.8.16, p.156.
What this shows is two Jordan contours, one of which surrounds the
half-ring on the lower right side of the picture and the other the
half-ring on the upper left side. Because the two contours go in
opposite directions along the straight-line segments, if we sum the
integrals of any function over those contours, the summands coming from
those segments cancel, and the sum of these integrals is equal to the
sum of the integrals over the two circles in part (a) of the picture.
----------------------------------------------------------------------
You ask whether, in the definition of I(gamma) and O(gamma)
on p.156, the points of gamma itself are considered to belong
to one or the other.
Good question! They are not. The winding number is only defined
on the complement of gamma, so those sets consist of points of
that complement. Where the authors wrote "{ z \in C | ... }"
they should have written "{ z \in C \ gamma([a,b]) | ... }".
----------------------------------------------------------------------
You ask why on p.156, the region O_2 is considered to be "outside"
the path shown.
As I said in class when discussing that reading, the meanings defined
for "inside" and "outside" in that section don't entirely agree
with the everyday senses of the words. But that isn't uncommon
in mathematics; we need words to describe mathematically relevant
concepts, and if the everyday meanings of English words don't do
so, we make technical definitions to give them the meanings we need
to express. On that page, "inside" and "outside" are defined in terms
of winding numbers, so to check whether O_2 is inside or outside of
the path, you should compute the relevant winding numbers.
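Such winding numbers can be computed numerically, using the formula w(gamma, z_0) = (1/2 pi i) * integral of dz/(z - z_0) over gamma. The sketch below is my own (not from the text); it approximates a contour by a closed polygon, for which the integral over each segment is a logarithm.

```python
import cmath

def winding_number(path_points, z0):
    """Approximate w(gamma, z0) = (1/2 pi i) * integral of dz/(z - z0)
    over the closed polygonal path through path_points."""
    total = 0.0
    n = len(path_points)
    for k in range(n):
        a = path_points[k]
        b = path_points[(k + 1) % n]
        # exact integral of dz/(z - z0) along the segment a -> b;
        # the principal log is correct when the segments are short
        total += cmath.log((b - z0) / (a - z0))
    return total / (2j * cmath.pi)

# A circle of radius 1 about 0, sampled at 200 points:
circle = [cmath.exp(2j * cmath.pi * k / 200) for k in range(200)]
inside = winding_number(circle, 0.2 + 0.1j)    # winding number 1
outside = winding_number(circle, 2.0 + 0j)     # winding number 0
```

A point with winding number 0 would belong to O(gamma); a point with nonzero winding number to I(gamma).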
----------------------------------------------------------------------
You ask whether the proof of the converse direction of Theorem 8.11,
p.157, involves some general idea.
I would say yes: "In a context where you can prove integrals of
differentiable functions around closed paths to be zero, you can
deduce that winding numbers are also zero, by using a differentiable
function whose integral around a closed path is the desired winding
number."
This method was also used at the end of the proof of Cauchy's
Theorem, to show that if gamma and sigma are related as in
Lemma 8.4, and gamma satisfies the hypothesis of Cauchy's Theorem,
then so does sigma.
----------------------------------------------------------------------
Regarding the definition of simple connectedness of p.157, you write
> ... I was wondering if something could be simply connected but
> not connected or the other way around. ...
The "something" you are referring to is presumably a domain, since
the book only defines "simply connected" for domains. But if you
were wondering whether a domain could be simply connected but not
connected, it sounds as though you have forgotten the definition
of "domain", so you should look it up and review it.
As to the opposite question, of whether a domain D could be
connected (which is automatic) but not be simply connected --
the answer should be clear if you notice why the book is
introducing the concept of simple connectedness. We have been
investigating conditions under which the statement "the integral of
a differentiable function around a closed contour equals 0" will
hold. From the results of the preceding sections, the conditions
given in the definition of "simply connected" are what we need
on a domain D for that statement to hold for all functions and
all closed paths in D. On the other hand, I have repeatedly
emphasized that that statement that the integral is zero is not true
for all domains, all closed contours, and all differentiable functions;
in particular, not for the function 1/z on the domain C\{0}.
Hence, C\{0} is an example of a domain that is _not_ simply
connected.
----------------------------------------------------------------------
You ask why the first display on p.160 holds.
The authors make this their definition of the integral of f over
a not-necessarily-smooth path pi. If something is a definition,
it doesn't have to be proved true. One can ask whether it is a
useful definition, and, if it uses symbols suggestive of symbols
that have already been defined, whether the definition matches
what those symbols suggest; and finally, one can ask whether the
entity given by the definition is well-defined. This definition
does indeed match what the symbols suggest, because in the case
where pi is a contour, we already know that the equality holds.
The important question of whether it makes the integral well-defined
(if it didn't then it wouldn't be a legitimate definition) is addressed
on the bottom half of that page.
I hope you understand the difference between a definition and a
result! Although in ordinary life, the main purpose of definitions
is to explain how words are (already) used, in academic fields,
especially mathematics, they are used to introduce new usages that
will be useful in studying a topic. So when a definition is given,
the question is never "Why is it true?", but rather, such things
as "Why is it useful? How should I picture it?" and, if some entity
is defined in a way involving a choice (in this case, the choice
of lambda), "Is the resulting concept well-defined?"
----------------------------------------------------------------------
You ask why they can say in the 2nd and 3rd line of p.161 that
\integral_lambda f - \integral_mu f = 2 pi i.
Because \integral_lambda f - \integral_mu f = \integral_(lambda-mu) f,
and lambda-mu is a closed path which goes once clockwise around
z_0, hence has winding number 1 around that point.
This is something you should learn to do automatically in this
course -- when you see a sum of integrals over various paths, check
whether they can be put together and regarded as an integral over a
combined path.
----------------------------------------------------------------------
You ask how one would check that, as stated on p.161, the theorems of
Chapter 8 hold for arbitrary paths, under the extended definition
of path integral.
One would get these extended results from the results proved in
Chapter 8. Given an integral over an arbitrary path pi, one
approximates it by a contour lambda as on p.160, shows that if
pi satisfies the hypotheses of one of those results, then so will
lambda, applies the result as in Chapter 8 to lambda, and from its
conclusion (and the definition of the integral of f over pi as
the integral of f over lambda) concludes that the result is also
true for pi.
----------------------------------------------------------------------
You ask what a space-filling curve (mentioned on p.161) is.
One can construct a continuous map from the interval [0,1] to the
plane whose image is the unit square, {(x,y) | x\in [0,1], y\in [0,1]}.
By definition, a continuous map on [0,1] is a curve -- but the image
of this "curve" is two-dimensional. One can also do the same thing
in higher dimensions. One calls such strange examples "space-filling"
curves.
The construction of one such curve is outlined in Chapter 7, Exercise 14
(p.168) of Rudin's "Principles of Mathematical Analysis", a text used
by several sections of Math 104 last semester and this semester.
Another example is given in Steinhaus's "Mathematical Snapshots",
pp.84-86, which shows several steps in a sequence of curves that
approach a space-filling path as their limit.
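One standard way to build the approximating curves in such a sequence is the Hilbert curve construction. The following sketch (my own, not from either of the books mentioned; it uses a well-known bit-manipulation formulation) maps an index along the order-n approximation to a grid point:

```python
def d2xy(order, d):
    """Map index d along the order-n Hilbert curve approximation to a
    point (x, y) in a 2^order x 2^order grid."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# The order-n approximation visits every point of the 2^n x 2^n grid
# exactly once, moving one grid step at a time; as n grows, suitably
# scaled versions of these curves converge uniformly to a continuous
# map of [0,1] onto the unit square.
points = [d2xy(3, d) for d in range(4**3)]
```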
----------------------------------------------------------------------
You ask whether the existence of more points in
I(\boundary phi) \ phi(R) than just the one point z_0 referred
to in the first line of the proof of Lemma 9.1 (p.163) would make
the proof fail.
The proof does not assume that there is _only_ one such point !
It says that if there is or are _any_ such point(s), we choose
one, and call it z_0. The authors then define D to be C \ {z_0},
and eventually get a contradiction. Thus, they conclude that there
are no such points.
----------------------------------------------------------------------
You ask what they mean at the bottom of p.163 when they say
"let phi^(r) be the restriction of phi to R^(r)".
If f: X --> Y is any function between sets, and X_0 is a
subset of X, then "the restriction of f to X_0" means
the function f_0: X_0 --> Y defined by f_0 (x) = f(x) for
all x\in X_0, but undefined at all points of X \ X_0.
At first glance, it might seem to be "the same function" as f;
but important properties can change on restricting a function to
a subset of its domain: A function f that was not one-to-one
can have a restriction f_0 that is one-to-one, and the restriction
will often have a smaller image-set than the original function.
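Both phenomena are easy to see with a familiar real function (a hypothetical example of my own, not from the text):

```python
# Restricting f(x) = x^2 changes its properties.
def f(x):
    return x * x

full_domain = [-2, -1, 0, 1, 2]             # f is not one-to-one here:
values_full = [f(x) for x in full_domain]   # f(-1) == f(1) == 1

restricted_domain = [0, 1, 2]               # the restriction f_0 to
values_restricted = [f(x) for x in restricted_domain]  # [0,2] is one-to-one

not_injective = len(set(values_full)) < len(full_domain)
injective = len(set(values_restricted)) == len(restricted_domain)
```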
----------------------------------------------------------------------
You ask what the purpose of getting z_1 is on p.164, start of
second paragraph.
It is the same idea as the choice of z_0 on p.145 line 5. In each
case, we show that if a certain integral is nonzero, then enough
of its nonzeroness can be "concentrated near a point" to get a
contradiction. But on p.145 this had to be done with careful
estimates of absolute value, while here, we just use the fact that
a nonzero integer has absolute value >_ 1 and the contradiction
follows easily.
----------------------------------------------------------------------
You ask how we know, in the last paragraph of the proof of Lemma 9.1
on p.164, that for some N the rectangle R_N will belong to
N_delta(z_1).
Each time we subdivide, we get rectangles of 1/2 the linear dimensions
of the rectangles at the step before. Thus, if d is the length of
the diagonals of the original rectangle R, the rectangles after N
subdivisions will have diagonals of length d/2^N. By taking N
large enough, we can get d/2^N < delta. Since the diagonal is the
greatest distance between two points of a rectangle, any two points
of R_N are at distance < delta apart. So as z_1 is a point of
R_N, every point of R_N is at distance < delta from z_1, i.e.,
R_N is contained in N_delta(z_1).
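The required N can be computed explicitly. A quick numeric check (my own example; the values of d and delta are assumed for illustration):

```python
import math

# How many subdivisions N are needed so that d / 2^N < delta?
d = 5.0        # diagonal of the original rectangle R (assumed value)
delta = 0.01   # the required neighborhood radius (assumed value)

# Solve d / 2^N < delta, i.e. 2^N > d / delta:
N = math.floor(math.log2(d / delta)) + 1
diagonal_after_N = d / 2**N    # now smaller than delta
```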
----------------------------------------------------------------------
You ask why we need to begin the proof of Theorem 9.2 (p.164) by
cutting R into little rectangles.
Because we don't know that phi(R) itself is contained in any
disc in D; but by cutting it into small enough subrectangles
we can insure that each of these has image in such a disc.
(The authors don't say explicitly how we know that if we divide
R into small enough rectangles, each will have image in some
disc in D. I didn't have time to go into that today, but I'll
do so on Wednesday.)
----------------------------------------------------------------------
You ask why Stewart and Tall suggest thinking of [a,b]x[0,1] as
a subset of the complex plane C rather than as a subset of the
real plane R^2 (p.166).
In terms of the ideas involved, the real plane would certainly
be more appropriate, since we aren't using the concepts that are
special to complex numbers as distinct from pairs of real numbers
(complex multiplication, and the resulting definition of complex
derivative). I can only assume their reason is pedagogic: They
may have found that their students got confused if they talked
about a "distinction" between two things, C and R^2, that
consisted of the same elements; and, since that distinction wasn't
needed in the study of homotopy, they may have decided to ignore it.
----------------------------------------------------------------------
You ask how the paths referred to at the top of p.167 join together
to give a closed path.
Let us look at Fig.9.11 to be concrete. I think from your wording that
you think they are referring to _all_ the paths shown in that picture.
But they are not: It is the four paths labeled gamma_0, tau,
gamma_1, and rho (oriented in the appropriate directions) that make
up "\boundary phi". Hopefully, if you look at those in succession,
you will agree that they join together to give a closed path, and that
will answer your question. It remains only for you to reread the
first sentence at the top of p.167 and see that this does indeed
describe that closed path, and to think through how, not merely in
that example, but given any continuous map phi: R --> D, one gets a
closed path "\boundary phi". Fig.9.12 shows a slightly more complicated
example; Figures 9.7 and 9.8 on p.163 are further illustrations.
----------------------------------------------------------------------
You ask how in the situation discussed in the second paragraph
of p.167, we know that rho and tau can be made to go to a
single point.
When they say "Insist that each of rho and tau goes to a
single point", they don't mean that we start with a function
where they don't go to a single point and change it! They mean
that we restrict attention to the case where each of rho and
tau is a constant function, i.e., "goes to a point". (Because in
that case, we can describe the integral around \boundary R in
terms of gamma_0 and gamma_1 alone.)
----------------------------------------------------------------------
You ask whether people look at homotopies that don't satisfy either
of conditions (a) and (b) on p.167.
Well, the homotopies we've been looking at are between maps
[a,b] --> D. Topologists look, more generally, at homotopies
between maps X --> Y, where X and Y are topological spaces
(say, metric spaces). In that context, they often look at
them without any additional conditions.
In fact, a closed path in a domain D can be regarded as a
continuous map from the unit circle (generally denoted S) to D,
and then "closed path homotopies" can be regarded as unrestricted
homotopies between maps S --> D.
But for maps [a,b] --> D (D a domain), unrestricted homotopies
are of little interest, because any such path can be shown homotopic
by an unrestricted homotopy to any other.
----------------------------------------------------------------------
You ask why the proof of Theorem 9.3 (p.168) introduces the point
paths p_0 and p_1.
The fact that gamma_0 is fixed endpoint homotopic to gamma_1
means that there exists a map phi from the rectangle [a,b]x[0,1]
to D sending one edge to gamma_0 and the opposite edge to
gamma_1, and sending the remaining two edges to single points.
To prove Theorem 9.3, we apply Theorem 9.2 to that phi. The
conclusion of Theorem 9.2 concerns an integral over \boundary phi,
and this boundary is the sum of four paths. In this situation, two
of these are the point paths you refer to. Even though they are
not visible in the drawing of \boundary phi, they are paths, each
describing the value of phi(\boundary R(t)) as t moves over a
certain interval, and to state Theorem 9.2 for this case we
must include them in the equation. Of course, the fact that they
"don't go anywhere" immediately makes the integrals over these paths
zero, so after dutifully mentioning them, we can drop those terms
from the equation.
----------------------------------------------------------------------
You ask what in Fig.9.15 (p.169) correspond to the "rho" and "tau"
of Fig.9.11.
The dotted curve within the system of curves corresponds to both!
Just as in Fig.9.11, for each value of s, the line in the rectangle
at the left from the point (a,s) to the point (b,s) is mapped
by phi to a path starting at a point of rho and ending at the
corresponding point of tau, so in Fig.9.15, for each value of s,
the line in the rectangle at the left from (a,s) to (b,s) is
mapped by phi to a closed path starting and ending at the same
point of the dotted curve. (They emphasize this by making one such
line and the corresponding curve dark.)
----------------------------------------------------------------------
As you suggest, the two sides of the cuts in the second picture on p.174
are separated simply to make it easier for the eye to follow the
modified curve. They should be thought of as the same line segment.
----------------------------------------------------------------------
You ask how the authors go from Fig.9.19 to Fig.9.20 (p.174).
They are combining the method of homotopy with the method of cuts,
which they used on pp.153-154 to get from "Cauchy's Theorem" to
the "Generalized Cauchy's Theorem". That method involved inserting
in a path gamma certain pairs of paths sigma and -sigma, whose
contributions cancel out, so that the integral of any function over
the new path will equal the integral over the original gamma. If
you didn't thoroughly absorb that idea then, you should review it
before looking at what the authors do on p.174. There, they take a
path gamma which is not itself homotopic to zero, and after inserting
some cuts "sigma and -sigma", get a path pi which is homotopic to
zero. Since the integral of a differentiable function f over a
closed path pi which is homotopic to 0 equals 0, and since any
integral over gamma equals the integral of the same function over
pi, it follows that the integral of any differentiable function over
gamma must equal 0.
----------------------------------------------------------------------
You ask whether there is anything like the Cauchy Integral Formula
(p.178) for real functions.
For arbitrary (or even arbitrary infinitely differentiable) real
functions, no. But there are certain sorts of functions of n real
variables which when n = 2 turn out to be exactly the functions
which can occur as real or imaginary parts of a differentiable function
of a complex variable; and analogs of Cauchy's Theorem do hold for
these. Such functions are considered briefly in our text, in
section 13.4 (p.249); they are the solutions to the Laplace equation
given at the start of that section. But our book doesn't discuss
the version of the Cauchy Integral Formula that they satisfy.
----------------------------------------------------------------------
You ask how the authors get the inequality |F(z)| < M at the
top of p.179.
We know that as z --> w, we have F(z) --> f'(w). So taking any
positive epsilon, we know that for z sufficiently close to w
we have | F(z) - f'(w) | < epsilon. This gives |F(z)| <
|f'(w)| + epsilon, so defining M = |f'(w)| + epsilon we get
|F(z)| < M, as required.
----------------------------------------------------------------------
You ask why the restriction "epsilon < delta" is needed for the
second display on p.179 to hold.
The preceding display asserts an inequality for 0 < |z-w| < delta.
That inequality is applied in the display you asked about to
bound |F(z)| on the circle S_epsilon. Since that circle consists
of points z with |z-w| = epsilon, we must assume epsilon < delta
to be able to say that the preceding display applies.
----------------------------------------------------------------------
You ask why, as stated in the next-to-last sentence of the proof
of Lemma 10.2, p.181, "the integral is unchanged as r is varied
in the range 0 < r < R".
This can be justified in either of two ways: By Theorem 9.4, as
you suggest, or by Theorem 8.9 on p.154, applied as discussed in
the two paragraphs preceding that theorem.
----------------------------------------------------------------------
You ask what the restrictions should be on h in Theorem 10.3 on p.181.
The authors indicate the restrictions when they say that the Taylor
series expansion is valid "in any disc N_R(z_0)\subset D". Looking
at the equation for the expansion, we see that f is being applied to
z_0 + h, hence they must mean that z_0 + h is in that disc, hence
that |h| < R.
----------------------------------------------------------------------
Anyway, you ask about the r for which the formula for a_n at
the bottom of p.181 is valid.
It is valid for all r such that 0 < r < R. In computing a_n for
different n, we could use the same or different values of r; the
values we get won't depend on the r that we choose, as noted in the
last few lines of the proof of Theorem 10.2.
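One can check this independence of r numerically. The sketch below is my own (the sampling scheme and function names are not from the text); it approximates the coefficient formula a_n = (1/2 pi i) * integral of f(z)/(z - z_0)^(n+1) dz over a circle of radius r:

```python
import cmath, math

def taylor_coefficient(f, z0, n, r, samples=2000):
    """Approximate a_n by integrating f(z)/(z - z0)^(n+1) over the
    circle of radius r about z0 (trapezoid rule, which is very
    accurate for periodic integrands)."""
    total = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        z = z0 + r * cmath.exp(1j * theta)
        dz = 1j * r * cmath.exp(1j * theta) * (2 * math.pi / samples)
        total += f(z) / (z - z0) ** (n + 1) * dz
    return total / (2j * math.pi)

# For f = exp and z0 = 0, the coefficient a_3 should be 1/3! = 1/6,
# whatever radius 0 < r < R we use:
a3_small = taylor_coefficient(cmath.exp, 0, 3, r=0.5)
a3_large = taylor_coefficient(cmath.exp, 0, 3, r=1.5)
```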
----------------------------------------------------------------------
You ask, in connection with the discussion at the bottom of p.182,
isn't every differentiable function R --> R equal to the sum of
its Taylor series?
Definitely not! Our authors pointed this out on p.77, in the sentence
after the first display, "But this Taylor series does not equal F
..."; and in the third homework assignment you (hopefully) did the
computations supporting their statement.
In Real Analysis, something is proved about a function being
given by its Taylor Series -- but not that this is _always_ true.
You should go back to your 104 notes and see what it was that
was proved!
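The standard counterexample (essentially the function on p.77) can be probed numerically. This is my own illustration; it takes as known the fact that every derivative of this function at 0 is 0:

```python
import math

# f(x) = exp(-1/x^2) for x != 0, f(0) = 0.  Every derivative of f
# at 0 is 0, so its Taylor series at 0 is identically zero -- yet
# f itself is not.
def f(x):
    return math.exp(-1.0 / x**2) if x != 0 else 0.0

taylor_series_value = 0.0      # the sum of 0 * x^n for all n
f_at_half = f(0.5)             # exp(-4), clearly nonzero
# Near 0 the function is extraordinarily flat, which is why all its
# derivatives vanish there:
f_at_tenth = f(0.1)            # exp(-100), vanishingly small
```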
----------------------------------------------------------------------
> Why is that some infinitely differentiable real functions are not
> equal to their Taylor series but no such complex functions exists?
(p.182, next-to-last paragraph).
I can't give any satisfactory answer except that we've proved it!
Intuitively, one can say that a function like e^{-1/z^2} is able
to "hide" the fact that it "blows up" as z --> 0 from the view of
real analysis, because the blowup occurs off the real axis, but it
can't hide it from complex analysis; so it is able to provide a
counterexample in real analysis, but not in complex analysis. This
interpretation is based on the assumption that the function on the
complex plane is the "true" function. The only justification I can
give for such a viewpoint is to say that the things we've proved about
differentiable functions on domains in the complex plane give us so
much knowledge about them that we find we can understand the
peculiarities of related real functions best by looking at them as
restrictions of complex functions to the real line.
I hope this helps.
----------------------------------------------------------------------
You ask about the statement in the second paragraph of p.183 that a
complex function f is differentiable if and only if it is analytic.
The hard direction, "only if", is Theorem 10.3, p.181. The easy
direction, "if", is Theorem 4.12, p.74.
We have different words, "differentiable" and "analytic", because
the concepts are not equivalent for functions of a real variable.
But in the case of a complex variable, by the above observation, the
two words describe the same class of functions, and so are synonyms.
----------------------------------------------------------------------
Concerning the observation on p.183 that Morera's Theorem is a partial
converse of Cauchy's Theorem, you ask whether the full converse of
Cauchy's Theorem is true.
Interesting question. Yes, the full converse is true: If f is
a continuous function on a domain D, with the property that for
every closed contour gamma in D which does not wind around any
point outside D, one has \integral_gamma f = 0, then f is
differentiable.
To see this, take any point z_0 in D, and let N be a disc
in D containing z_0. Then N is a simply connected domain, so
every contour gamma in N has the property of not winding around
any point outside N, and so in particular has the property of not
winding around any point outside D; so by the assumption on f,
the integral of f is zero around every contour in N. So
applying Morera's Theorem to N, we see that f is differentiable
in N; in particular, it is differentiable at z_0, which we started
by choosing as an arbitrary point of D. So f is differentiable
in D.
Probably the reason that this full converse to Cauchy's Theorem is
not given is that restricting attention to curves gamma which do not
wind around any point outside D complicates the statement. We could
not do without that condition in Cauchy's Theorem itself, because
without it the theorem failed; but dropping this condition from the
converse gives a formally weaker rather than a stronger statement,
hence one that is still true. And as shown above, the full converse can
be deduced from it.
People make "straightforward" changes in the statements of theorems
when writing texts according to their feeling of how the subject is
most elegantly developed. It would be interesting to see whether there
are texts in which the statement of "Morera's Theorem" has the above
stronger form.
(Sent later:) It turns out that Sarason's text gives a form of Morera's
Theorem which is not weaker than, or equivalent to, but stronger
than the converse to Cauchy's Theorem. He gives it as an exercise
in the middle of p.81; it says that f will be differentiable (in
his language: holomorphic) if we merely assume that the integral of f
around every rectangle which, together with its interior, is contained
in the domain on which f is defined, is zero.
----------------------------------------------------------------------
You ask about the statement at the beginning of the last paragraph
on p.183, "If f is differentiable ... then all the higher derivatives
of f exist." That is by the first sentence of Theorem 10.3, p.181.
----------------------------------------------------------------------
You ask whether Cauchy's estimate, |f^(n)(z_0)| <_ Mn!/r^n (p.184)
holds "for all values z_0 in C or for a specific z_0 at the center
of C_r".
The statement, with quantification shown, is the following: Suppose
f is a function on a domain D, z_0 is a point of D, R is
a constant such that N_R(z_0) is contained in D, r is any
constant such that 0 < r < R, and M any upper bound for |f(z)|
on the circle of radius r around z_0. Then for every nonnegative
integer n, we have |f^(n)(z_0)| _< M n! / r^n.
So in summary, the result is true for all z_0, but after choosing
z_0 we must choose a circle C_r having z_0 at its center (not
an arbitrary circle), and a bound M for f(z) on that circle,
before we can state the inequality.
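A quick sanity check of the estimate for a particular function (my own example, not from the text): take f = exp and z_0 = 0, so that f^(n)(0) = 1 for every n, and on the circle |z| = r we have |e^z| = e^{Re z} <= e^r, so M = e^r works.

```python
import math

r = 2.0
M = math.exp(r)          # an upper bound for |e^z| on the circle |z| = r
derivative_at_0 = 1.0    # |f^(n)(0)| = 1 for f = exp, every n

# Cauchy's Estimate says |f^(n)(z_0)| <= M * n! / r^n:
bounds = [M * math.factorial(n) / r**n for n in range(11)]
estimate_holds = all(derivative_at_0 <= b for b in bounds)
```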
----------------------------------------------------------------------
I hope what I said in class explained why they could say in the
middle of p.185 that | P(z) / z^n | >_ 1/2 for |z|>k: They are
calling on the "epsilon-delta" version of the preceding limit
statement, with epsilon = 1/2. (Though for limits as |z| --> infinity,
it is, strictly speaking, an "epsilon-k" statement rather than
"epsilon-delta".)
----------------------------------------------------------------------
You ask about the "z_0" introduced just below the middle of p.185.
The previous computation has concerned points outside a circle of
radius k, so now they consider what happens inside that circle,
saying "For |z_0| _< k ...". For such a z_0, they apply Cauchy's
Estimate using a circle C_R about that point, of a radius R "which is
so large that" C_R lies in the region for which we already have
the inequality.
----------------------------------------------------------------------
You ask how, in the proof of Theorem 10.7, p.185, one can choose R
to make the next-to-last display of that proof hold.
What the authors mean would have been clearer if they had started
a new sentence with the word "then" that comes before that display.
They mean that _if_ R is chosen large enough so that the preceding
condition holds, _then_ the display will hold. Indeed, the preceding
condition says |z| >_ k for all z on C_R, while the previous
display shows that any point z satisfying |z| >_ k has
1/|P(z)| _< 2/k^n. Hence this will hold for all z on C_R, as
claimed.
----------------------------------------------------------------------
In connection with the discussion on p.186, you ask about zeroes of
infinite order.
The authors don't actually define that term; but if we take it to
mean "a zero that isn't of finite order", then a differentiable
function f on a domain will have a zero of infinite order if
and only if it is identically zero.
----------------------------------------------------------------------
You ask what it means for a function f on a domain D to be
identically zero (p.186, Corollary 10.9).
It means that f(z) = 0 for all z\in D; in other words, f is
the zero function on D.
You add "Does it mean that the function is equivalent to zero but
not necessarily defined as f(x)=0?" Two functions are equal if
they have the same domains and the same values at each point of the
domain. What formulas or words were used to define them does not
make a difference here; so we don't have a concept of "equivalent"
such as you were suggesting. Functions are either equal, i.e., the
same function, or not equal.
----------------------------------------------------------------------
You ask why at the bottom of p.186 the errata sheet says to change S
to S \ {z_0}, and whether this is defined if z_0 is not in S.
If we merely chose a sequence in S which tends to z_0, this might
be the constant sequence z_0, z_0, z_0, ..., and having such a
sequence of zeroes would not contradict the zero z_0 being isolated,
so we could not complete the proof.
As for whether S \ {z_0} is defined in the case where z_0 is not
in S; yes. For any two sets X and Y, the set X \ Y is defined
to be {p\in X | p\not-in Y}. If X and Y are disjoint, this set
is just X.
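The definition is easy to check concretely. A minimal Python sketch (the particular sets below are made-up examples, not from the book — Python's "-" operator on sets is exactly this set difference):

```python
# X \ Y = {p in X | p not in Y}; in Python, the "-" operator on sets
X = {1, 2, 3}
Y = {4, 5}
assert X - Y == X              # Y disjoint from X: X \ Y is just X

S = {0.5, 0.25, 0.125}         # a set not containing z_0
z0 = 0
assert S - {z0} == S           # z_0 not in S: removing it changes nothing
assert (S | {z0}) - {z0} == S  # z_0 in the set: it alone is removed
```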
----------------------------------------------------------------------
You ask whether the set S in Prop. 10.10 and Theorem 10.11 (p.187)
must have a limit point in S itself or merely in D.
It only needs to have a limit point in D. (The condition that really
matters is that the set {z\in D|f(z) = 0} in Prop.10.10, respectively
{z\in D|f(z) = g(z)} in Theorem 10.11, should have a limit point in D.
That limit point will belong to the set, by continuity. The authors
are introducing a subset S of those sets; this may be helpful in
applications of the result, where one may not be able to find all points
satisfying the stated equations, but can produce a particular set S
of such points which has a limit point in D. But whether that limit
point belongs to the set S one has produced is irrelevant; the point
is that its existence shows that the set described above has a limit
point in D, and in fact in itself.)
----------------------------------------------------------------------
You ask about the statement in the proof of Proposition 10.10, p.187,
that "By continuity, f(gamma(s)) = 0".
To show f(gamma(s)) = 0, it will suffice to show that for every
epsilon > 0, |f(gamma(s))| < epsilon. Let epsilon be chosen.
By continuity there exists delta such that |s-t| < delta ==>
|f(gamma(s)) - f(gamma(t))| < epsilon. Now by choice of s (see
preceding paragraph in the book) we can find x\in [a,s] arbitrarily
close to s with f(gamma(t)) = 0 for all t\in [a,x], in particular,
for t = x. Taking such an x with |s-x| < delta, we get
|f(gamma(s))| = |f(gamma(s)) - f(gamma(x))| < epsilon, as required.
----------------------------------------------------------------------
You ask about inequalities (i) and (ii) in the proof of Laurent's
Theorem. Recall that the triangle inequality, for a triangle with
sides of lengths a, b and c, can be written a _< b + c or
a >_ b - c. The first
is the "obvious" application of that inequality; the second is gotten
by rearranging the inequality b _< a + c. In the present situation,
consider the triangle with vertices z_0, z_0 + h, and z, where, as
in (i), we take z on C_r_2. To make the second version of the
triangle inequality noted above applicable, we should call the side of
that triangle with endpoints z and z_0+h "a", call the longest
side, the one with endpoints z_0 and z, "b", and call the remaining
side, with endpoints z_0 and z_0+h "c". Since z is on the circle
C_r_2, the length of "b" is r_2, while of course the length of "c"
is |h|. The length of "a" is the left-hand side of the inequality (i)
that we want to prove. The version of the triangle inequality noted
above gives
|z - (z_0+h)| >_ r_2 - |h|.
Comparing with what we want, we see that we just have to apply
the assumption |h| < rho_2 (equivalently, -|h| > -rho_2) to
get (i). Inequality (ii) is proved similarly, except that since
we again want the side "b" of our triangle to be the longest one,
in this case we should take this to be the side with endpoints z_0
and z_0+h and let "c" be the side with endpoints z_0 and z.
I worded the first part of the above discussion in a way that emphasized
geometric thinking. To get the displayed inequality above purely
algebraically, use the triangle inequality in the form
|z - z_0| _< |z - (z_0+h)| + |(z_0+h) - z_0|,
then note that the leftmost term is r_2 and the rightmost term
is |h|, and solve this inequality for the remaining term.
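As a numerical sanity check of the displayed inequality, one can sample random points z on the circle C_r_2 and random h with |h| below a bound rho_2 (the particular z_0, r_2 and rho_2 below are made-up values, not the book's):

```python
import cmath
import math
import random

# check |z - (z_0+h)| >= r_2 - |h| for z on the circle C_r_2 about z_0
random.seed(1)
z0 = 1 + 2j       # hypothetical center
r2 = 3.0          # hypothetical radius of C_r_2
rho2 = 1.5        # hypothetical bound, with |h| < rho_2 < r_2
for _ in range(1000):
    z = z0 + r2 * cmath.exp(1j * random.uniform(0, 2 * math.pi))
    h = cmath.rect(random.uniform(0, 1.4), random.uniform(0, 2 * math.pi))
    assert abs(z - (z0 + h)) >= r2 - abs(h) - 1e-9   # triangle inequality
    assert abs(z - (z0 + h)) > r2 - rho2             # inequality (i), since |h| < rho_2
```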
----------------------------------------------------------------------
You ask whether it is important to understand long proofs like that
of Laurent's Theorem (pp.196-199).
Yes. If its length and the number of tricks in the proof are
intimidating, you can rewrite it as a series of lemmas, each of
which has a shorter proof: One lemma could say that if A is an
annulus about a point z_0 as shown in Fig.11.1 and z_0+h is
a point of A, and if g(z) is a differentiable function on
A \ {z_0+h}, then the integral of g(z) over a small circle
S_epsilon around z_0+h is equal to the difference of the integrals
around two large circles C_r_1 and C_r_2 as shown in that figure.
A second lemma would apply the above result to a function of the
form g(z) = f(z)/(z-(z_0+h)) where f(z) is differentiable on the
whole annulus, and combine with Cauchy's integral formula for f(z)
in the circle S_epsilon, to get formula (1) on p.197.
A third lemma could show that the first integral in (1) is equal
to an appropriate Taylor series (p.198, displays just below middle
of page). A final lemma would evaluate the second integral in (1),
getting the last display in the proof, on p.199. Put together,
these results give Laurent's Theorem.
Each of these lemmas involves just one or two basic ideas.
----------------------------------------------------------------------
You ask whether the fact that the Taylor series of e^(-1/z^2) about
z = 0, is identically 0 means that its Laurent expansion only
involves terms with negative exponents.
Actually, there is no Taylor series expansion of e^(-1/z^2) about
z = 0. There is a Taylor series expansion of e^(-1/x^2) about x = 0
for a real variable x, which as you say is identically zero. But as
I discussed in class, we get such a series because our looking only at
real x "hides" the fact that e^(-1/z^2) "blows up" at z = 0 for
non-real values.
The Laurent expansion of e^(-1/z^2) does involve one term with
non-negative exponent n, namely the n = 0 term. It happens
not to involve any terms with positive exponent, because of the way
it is constructed, but if we vary the function a little bit, for
instance, taking e^(z - 1/z^2), we get a function which still has
restriction to the real axis with Taylor series identically zero, but
which now has Laurent series expansion with infinitely many nonzero
terms with positive exponent.
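These claims about the coefficients can be checked numerically (this is my illustration, not the book's): the Laurent coefficient a_n equals (1/(2 pi i)) times the integral of f(z)/z^(n+1) over the unit circle, and since the integrand is smooth on that circle, a uniform Riemann sum approximates it extremely well.

```python
import cmath
import math

def laurent_coeff(f, n, samples=4096):
    # a_n = (1/(2 pi i)) * integral of f(z) z^(-n-1) dz over |z| = 1;
    # with z = e^(it), dz = iz dt, this reduces to the average below
    total = 0j
    for k in range(samples):
        t = 2 * math.pi * k / samples
        total += f(cmath.exp(1j * t)) * cmath.exp(-1j * n * t)
    return total / samples

f = lambda z: cmath.exp(-1 / z**2)   # series: sum of (-1)^n z^(-2n) / n!
assert abs(laurent_coeff(f, 0) - 1) < 1e-9    # the n = 0 term is present
assert abs(laurent_coeff(f, -2) + 1) < 1e-9   # coefficient of z^(-2) is -1
assert abs(laurent_coeff(f, 1)) < 1e-9        # no positive-exponent terms

g = lambda z: cmath.exp(z - 1 / z**2)
assert abs(laurent_coeff(g, 1)) > 0.1         # positive exponents now occur
```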
----------------------------------------------------------------------
You ask about the statement that the two series shown in the middle
display on p.200 converge absolutely for R_1 < |z - z_0| < R_2.
The key fact is Lemma 3.7, p.56, so review that lemma carefully if
you are not familiar with it. The lemma applies more or less directly
to the second of the series you asked about (with n-m in place
of the n of the lemma and z-z_0 in place of the z). To apply it
to the first of these series, we put (z-z_0)^-1 in place of the z
of the lemma.
The one catch is the assumption in the lemma that the series converges
for some point z_1 with |z_1| > |z|. To see this in the case
of the series for F_2, given z take a point z_1 such that
|z - z_0| < |z_1 - z_0| < R_2. Then z_1 will still lie in the annulus
where the given series are assumed to converge, and (recalling that
we are applying the lemma to z - z_0 in place of z) the required
relation of absolute values is satisfied. To get the corresponding
condition in the formula for F_1, use a point z_1 with
R_1 < |z_1 - z_0| < |z - z_0|.
----------------------------------------------------------------------
You ask about the concept of singularity (p.201) in the case where the
function is defined at the point z_0.
As I said in class, the authors should have included in the definition
of (isolated) singularity the condition that f be undefined at z_0.
The example of (sin z)/z is typical of the situation that motivates
the concept of removable singularity.
Of course, one can take a differentiable function f which is defined
at z_0 and form a new function g defined to equal f at points
other than z_0 and be undefined there; and this will have a removable
singularity at z_0. Moreover, since people often don't worry about
being precise in distinguishing functions that differ by minor changes
in domain of definition, g might even be denoted by the same symbol
as f.
In practice, one only speaks of a removable singularity in situations
where one starts off with a function undefined at z_0 whose behavior
as one approaches z_0 one doesn't know, and which one then proves in
one way or another to behave nicely there, so that the singularity can
be "removed".
As for having a function defined at a point, but not differentiable
there (e.g., f(z) = z if z is nonzero, 1 if z = 0; or
f(z) = 1/z if z is nonzero, 0 if z = 0), this would rarely come
up: in complex analysis one almost always deals with functions that
one knows (or proves) to be differentiable wherever they are defined.
----------------------------------------------------------------------
You ask how to tell the nature of a singularity (pp.201-202) without
writing the function as a Laurent series.
Lemma 11.2 and Proposition 11.4 give such techniques.
----------------------------------------------------------------------
You ask about the first display on p.203.
This is gotten by applying the Estimation Lemma (p.111) to the
last display on p.202. As to why there is no "i" along with
the "1/2 pi", this is because they are taking the absolute value,
and the absolute value of 1/(2 pi i) is 1/(2 pi).
----------------------------------------------------------------------
You ask, regarding the last sentence on p.203 ("Conversely ..."), how
we know that the limit is finite so that we can apply Lemma 11.2.
The authors are inconsistent as to whether they specify that a
limit is finite; in Lemma 11.2 they did, while in this proposition
they didn't. But one should assume that a statement that a limit
of a C-valued function exists means that this limit is a member
of C, not "infinity", unless it is explicitly stated that the
value infinity is also allowed. So in Proposition 11.4, the value
infinity is not allowed for l -- if we allowed it, the proposition
would not be true.
----------------------------------------------------------------------
You ask what the "one-to-one correspondence" referred to in the
second line of p.206 is.
It is the correspondence implicit in the beginning of that sentence:
a point (x,y) in C corresponds to the point (xi, eta, zeta)
of S^2 other than the north pole at which the line through the
north pole and (x,y,0) cuts S^2.
----------------------------------------------------------------------
You ask what point of S^2 corresponds to the origin in C under
the correspondence discussed on p.206.
One finds it just as for other points of C: draw a line from the
north pole of the sphere through the origin, and see at what other
point it crosses the sphere. The answer is the south pole, (0,0,-1).
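The correspondence is easy to compute with. Here is a sketch using the standard formulas for stereographic projection from the north pole (0,0,1); I believe these agree with the book's (xi, eta, zeta), but the normalization is mine, not quoted from the text:

```python
def to_sphere(z):
    # stereographic projection: the point of S^2, other than the north
    # pole (0,0,1), where the line through (0,0,1) and (x,y,0) cuts S^2
    x, y = z.real, z.imag
    d = x * x + y * y + 1
    return (2 * x / d, 2 * y / d, (x * x + y * y - 1) / d)

assert to_sphere(0j) == (0.0, 0.0, -1.0)   # the origin maps to the south pole

xi, eta, zeta = to_sphere(3 + 4j)
assert abs(xi**2 + eta**2 + zeta**2 - 1) < 1e-12   # the image lies on S^2
t = 1 - zeta   # parameter on the line (0,0,1) + t*((3,4,0) - (0,0,1))
assert abs(xi - 3 * t) < 1e-12 and abs(eta - 4 * t) < 1e-12  # collinear
```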
----------------------------------------------------------------------
I'm trying to see why you feel it would be difficult to use the
definition saying that f has a given sort of singularity at infinity
if and only if f(1/z) has the same sort of singularity at 0 (p.207)
to easily prove results about these conditions (p.208, third paragraph).
Maybe you took "prove" to mean "prove from scratch". What they
actually mean is that after translating a statement about singularities
at infinity into a statement about singularities at finite points,
one can apply results we have gotten in previous sections about
singularities at finite points to establish that statement. E.g.,
Lemma 11.2 immediately yields criteria for a singularity at infinity
to be removable. Make sense?
----------------------------------------------------------------------
You ask about Example 4 on p.207.
First, "1/sin(1/z)" is definitely correct as it stands, since by
definition, the function that we look at to describe the type of
singularity f(z) has at z = oo is f(1/z) (not 1/f(z)).
As you note, the behavior of "1/sin(1/z)" near z = 0 does not match
any behavior they have described earlier. If I were they I would have
discussed such an example earlier. It is a singularity that is _not_
isolated, hence falls outside the scope of section 11.2; but they
could have noted it there as an example of a function that falls
outside that scope, rather than waiting, and making it seem like
something special that comes up when we look at singularities at
infinity. After all, sin(1/z) similarly falls outside the scope of
section 11.5.
----------------------------------------------------------------------
You ask what it means for a function to be rational, as referred to in
Proposition 11.8, p.208.
Did you have a difficulty understanding the definition given in the
preceding three lines, "Recall that a rational function is one of the
form ..." ? Or did you not notice it?
If you did not notice it, that may mean you are not reading the book
carefully enough. "Skimming" does not work when reading mathematics!
In any case, if when reading a mathematics text you discover that
you do not recognize a word that is being used, the first thing you
should do is go over the immediately preceding paragraphs and see
whether it is defined there. If you do not find it there, you should
look in the index. Indices of math texts are often very poor in showing
where concepts are used, but are generally very good at showing where
they are defined. If you had looked up "rational function" in the
index, it would have referred you back to this page, showing you that
you need to search that page again for the definition of the term.
Note that terms being defined are generally put in italic (or in some
texts, boldface) type, so they are easy to spot.
If, on the other hand, you did read that definition but had difficulty
understanding what it meant, you should specify that in your question,
and say if possible what the difficulty was.
----------------------------------------------------------------------
You ask about the residue (p.212) of a function at a removable
singularity.
At a removable singularity the residue is necessarily 0. By the
definition of removable singularity (p.201) the function has an
expansion with no negative-exponent terms. Since the residue means
the coefficient of the exponent-"-1" term, it is zero at such a point.
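A quick numerical illustration (my own, not from the book): approximating res(f,0) = (1/(2 pi i)) times the integral of f over a small circle by a Riemann sum, a removable singularity gives residue 0, while 1/z gives 1. The radius and sample count below are arbitrary choices.

```python
import cmath
import math

def residue_at_zero(f, r=0.5, samples=2048):
    # (1/(2 pi i)) * integral of f over |z| = r; with z = r e^(it),
    # dz = iz dt, the whole thing collapses to the average of f(z) * z
    total = 0j
    for k in range(samples):
        z = r * cmath.exp(2j * math.pi * k / samples)
        total += f(z) * z
    return total / samples

assert abs(residue_at_zero(lambda z: cmath.sin(z) / z)) < 1e-9  # removable: 0
assert abs(residue_at_zero(lambda z: 1 / z) - 1) < 1e-9         # simple pole: 1
```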
----------------------------------------------------------------------
You ask about the first sentence of the proof of Theorem 12.1, p.213.
The fact that S is open is just used to conclude that it contains
circles S_r and their interiors if their radii epsilon_r are taken
sufficiently small. The fact that by taking epsilon_r small one can
also guarantee that S_r has no singularity other than z_r holds for
a different reason: because z_r is an isolated singularity.
----------------------------------------------------------------------
You ask whether the result of Chapter 8 Exercise 7, in which the
a_r can be interpreted as residues of simple poles, is also valid
if the simple poles are replaced by arbitrary poles and the a_r
by their residues.
Yes! (Not only for poles, but for arbitrary isolated singularities.)
And, as you observe, this gives a generalization of the form of Cauchy's
Residue Theorem on p.213 to one that allows paths which may have winding
number other than 1 about points inside them.
I don't know quite why the authors chose to state Cauchy's Residue
Theorem only for "simple loops". Probably because they wanted to
give it as a tool for integration problems, and in such applications,
the curves used are almost always simple loops; so they felt it would be
best to give the result without the "winding number" coefficient that
would make it harder to memorize without adding to its usefulness.
But conceptually, as distinct from practically, the general form is
more satisfying.
----------------------------------------------------------------------
Regarding the computation in the middle of p.215 you ask
> Why do the authors introduce the term T_j(z)? Is it so that we can
> see that Q_j(z) has an anti-derivative?
Right.
> This implies that Q_j(z) is differentiable, right?
It implies that it is differentiable on the set where T_j(z) is
defined.
> But I am troubled because if I look at the original formula for
> Q_j(z), its derivative does not appear to be defined at z=z_j.
That's true. But I think you are confusing two facts. One is that
if a function is differentiable on a closed path and everywhere inside
it, then its integral over that path is 0. This is Cauchy's Theorem;
but we can't make use of it here because, as you observe, Q_j is not
differentiable at z_j. The other is Theorem 6.11, a consequence of
the Fundamental Theorem of Calculus, saying that if a function has an
antiderivative on a closed path (whether or not it has one at all points
inside it), then its integral over the path is zero. It is to make use
of this fact that the authors introduce the T_j.
----------------------------------------------------------------------
You ask about a possible result combining the ideas of Lemmas 12.2 and
12.3 (p.216), that would give a formula for the residue of p(z)/q(z)
at a point where p and q have zeroes of orders a and b, in
terms of derivatives of those functions.
For any a and b one can work out a formula, since the derivatives
of p and q determine the coefficients of their Taylor series; but
those two lemmas give cases where the formula is particularly simple;
in other cases, it will be more complicated, involving sums of products
of various derivatives. E.g., even taking p(z) to be the constant
function 1, and letting q(z) have a zero of order 3, if you work
out the beginning of the power series expansion of p(z)/q(z) =
1 / (a_3 (z-z_0)^3 + a_4 (z-z_0)^4 + a_5 (z-z_0)^5 + ... ) (by "long
division of power series"), and find the coefficient of (z-z_0)^-1,
you will see that it is more complicated than the sort of formula you
suggested.
----------------------------------------------------------------------
Regarding the line after the display just above the middle of p.217,
you ask "How does the "which --> 3*2 = 6 as z -->1. So res(f,1) = 6"
follow from the previous equation?"
In the right-hand side of the previous equation, 6/2! gives 3,
while z+1 -> 2 as z->1. So the product approaches 6. Moreover,
the left-hand side is res(f,1) by Lemma 12.3, using the fact that
the order m of the pole of the function given here is 3.
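To see the mechanism concretely, consider the hypothetical function f(z) = ((z+1)/(z-1))^3 — my guess at an example of this shape, not necessarily the book's. Here (z-1)^3 f(z) = (z+1)^3, whose second derivative divided by 2! is 3(z+1), which approaches 6 as z -> 1; and a numerical contour integral for the residue agrees:

```python
import cmath
import math

def residue(f, z0, r=0.3, samples=4096):
    # (1/(2 pi i)) * integral of f over the circle |z - z0| = r, as a
    # Riemann sum: with z = z0 + r e^(it), dz = i (z - z0) dt
    total = 0j
    for k in range(samples):
        z = z0 + r * cmath.exp(2j * math.pi * k / samples)
        total += f(z) * (z - z0)
    return total / samples

f = lambda z: ((z + 1) / (z - 1))**3   # hypothetical pole of order 3 at z = 1
assert abs(residue(f, 1) - 6) < 1e-8   # matches the Lemma 12.3 computation
```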
----------------------------------------------------------------------
You ask about the equation (1 - z^2/6 + ...)^-1 = 1 + z^2/6 + ...
just below the middle of p.217.
The way to get such facts is discussed on the sheet containing the
homework for March 31.
----------------------------------------------------------------------
You ask how they get the final display on p.218.
By Cauchy's Residue Theorem! That's the tool that the Chapter is
teaching you and showing you how to use, so they presume that by
this page, when you see such an integral you will automatically
use that theorem to evaluate it.
----------------------------------------------------------------------
You ask how the book gets the third display on p.219, i.e., how they
expand e^(cos t + i sin t).
I hope that when I went through it in class I made it clear. Briefly,
apply the first display on p.84, and then expand the factor with the
exponent i sin t using display (11) on p.85.
Those formulas are important ones to know!
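Combining those two formulas gives e^(cos t + i sin t) = e^(cos t)(cos(sin t) + i sin(sin t)), which is easy to confirm numerically (the sample values of t are arbitrary):

```python
import cmath
import math

# e^(cos t + i sin t) = e^(cos t) * (cos(sin t) + i sin(sin t))
for t in (0.0, 0.7, 2.5, -1.3):
    lhs = cmath.exp(complex(math.cos(t), math.sin(t)))
    rhs = math.exp(math.cos(t)) * complex(math.cos(math.sin(t)),
                                          math.sin(math.sin(t)))
    assert abs(lhs - rhs) < 1e-12
```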
----------------------------------------------------------------------
You write that you don't see how the limits (8) and (9) on p.219
can be different.
They don't say that they can have different values; on the contrary
if the limit (8) exists, then the limit (9) exists and is equal to
it. (They say this on the third from last line of the page.) The
difference is more subtle: It may happen that (8) does not exist,
but (9) does. I hope that the discussion of this point that I gave
in class yesterday (and that I also gave when I previewed the material
on the day before Semester break) made clear how this can happen. If
not, come to office hours and I will try to help further.
----------------------------------------------------------------------
You ask how we can recognize whether the conditions of (II) (p.219)
or (III) (p.222) apply to a given function.
It isn't possible to give a test that will work in all cases. But
clearly, if we have a rational function p(z)/q(z) where p and q
are polynomials, then if deg(q) - deg(p) >_ 2, the condition of
(II) will hold, while if that difference is >_ 1, the condition
of (III) will. If we have a rational function multiplied buy some
other sort of function, we should look at how that other function
behaves for large z ... .
----------------------------------------------------------------------
You ask why condition (ii) on p.220 is stated in the sloppy way it
is, that would logically allow A to be a function of R, though
the conclusion is false if that is allowed.
It's hard to be sure. The answer might be "The mistake slipped past
the authors" or it might be "They took for granted that the reader would
understand what the order of quantifiers had to be for the statement
to be a reasonable one", or it might be somewhere between these.
Writing a book is very hard work. I've been working on one for
about 35 years (/~gbergman/245), and
every time I teach from it I still find things that need to be
clarified or corrected. When the authors were writing a given
paragraph, they were not only dealing with the words that we now see
there, but also with lots of other things that they were choosing
between saying and not saying, and lots of poor formulations that they
fixed and we aren't seeing. The fact that some badly worded statements
got through is not surprising. Ian Stewart is a prolific author, who
has written many good books both within mathematics and popularizing
math for the layperson. (Also some science fiction.) And when you
have a lot of books, you have less time to spend correcting mistakes
in each of them, unfortunately.
Or, as I said, they may have relied on the reader's interpreting their
phrase "correctly". When we speak colloquially, we don't always make
the order of quantification of our statements unambiguous, but rely
on the hearer to see what interpretation makes sense in the given
situation. In writing mathematics, I believe in being more precise,
but they may have drawn the line more loosely than I would. The
conversational style of the book is in most places a good feature; this
may be an exception.
----------------------------------------------------------------------
You ask about the statement on p.222 before the last integral that
"all we know is that f behaves like 1/x for large x".
The authors are talking about how fast f(x) goes to zero as x
approaches infinity. The assumption "|f(z)| _< A/R for |z|=R"
at the middle of the page, applied with a real value x in place
of z, tells us that f(x) goes to zero as the inverse first
power of |x| (or faster, since the given relation says "_<" and
not "="). This is what the authors mean. They are not trying to
make a precise statement here, but merely to contrast the situation
with the one on p.220 by pointing to the significant difference:
the R^2 on the denominator on that page has been replaced by an R
on this page. Going to zero as the inverse first power of |x| is
what the function 1/x does, and the integral of 1/x (from some
arbitrary positive value to +infinity) does _not_ converge, making the
hypothesis here seem unlikely to lead to convergence. Yet, as they
will show, it does.
----------------------------------------------------------------------
I hope that what I said in class clarified the point you ask about:
The concept of "Cauchy Principal Value" used on p.224 is different
from (though analogous to) that on p.219. It is defined in the
first display of heading (IV) on this page.
----------------------------------------------------------------------
You ask, concerning p.224
> How is the Cauchy principal value of the integral from -1 to 1 of
> (1/x) dx equal to 0? (the last equation before figure 12.6)
Did you try to compute that Cauchy principal value from the definition
(first display below middle of page)? As we have repeatedly seen, the
authors of this book take it for granted that if a calculation is
straightforward, they can simply state the result, and leave it to
the student to work out the details.
If you have difficulty doing the calculation, let me know.
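As a hint at how such a calculation goes: each piece of the symmetric integral has antiderivative ln|x|, so the two pieces cancel exactly for every epsilon, even before taking the limit. A sketch:

```python
import math

# PV = lim as eps -> 0 of [ int_{-1}^{-eps} dx/x + int_{eps}^{1} dx/x ]
for eps in (0.1, 0.01, 0.001):
    left = math.log(eps) - math.log(1.0)    # integral of 1/x from -1 to -eps
    right = math.log(1.0) - math.log(eps)   # integral of 1/x from eps to 1
    assert left + right == 0.0              # exact cancellation
```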
----------------------------------------------------------------------
You ask, regarding p.225 line 4,
> When the authors say to sum over only the the non real poles in the
> upper half plane, what exactly do we get? Display 8 on page 219 or 9?
They're outlining a general approach, which the reader can apply as
appropriate to the given problem. So one can get either, depending on
what one finds convenient in a particular problem.
----------------------------------------------------------------------
I hope what I said in class clarified the "_< M" in the last full line
of text on p.225.
Looking more closely at the second line of the display after that line,
which you also asked about, I see that the authors have produced a
somewhat illogical expression: They are finding the precise value
of the first summand, while giving a bound on the second. There is a
commonly used notation for such a situation, in which one writes
"O(x)" for any function that is bounded by a scalar multiple of x;
and in that notation they could have replaced the second summand by
"lim_{epsilon -> 0} O(epsilon)", which is obviously 0 so they could
dropped it (as they do) in the next line. But since they cannot
assume the general reader is familiar with "big-O" notation, as it is
called, and it is not important enough to the course to define, they
should have handled this calculation by separating the two parts, and
showing that the limit of the first integral is - i pi, while the
absolute value of the second integral is _< M pi epsilon so that it
has limit 0, so that the whole expression has limit - i pi + 0 =
- i pi.
----------------------------------------------------------------------
You ask what the calculation of pp.225-226 concludes about the value
of the integral from -infinity to infinity of e^(ix) / x, given
that the real part can be computed only as a principal value.
Putting the results of their calculations together, we can say that
we only have a principal value for that integral, namely i pi, as
shown at the bottom of p.225.
But the payoff is that in trying to compute that integral, which turned
out to exist only in a weak sense, they got the integral of (sin x)/x,
a genuine integral (p.226, top) !
----------------------------------------------------------------------
You ask about the possible values for "a" in (V), p.226.
As I noted in class, a cannot be an integer. It can be any
other complex number. But on the next page, on the third line of text,
where it says "If phi is such that," they require a condition relating
a and phi; so our choice of a is actually restricted to values
for which that condition holds. As I said in class, the next homework
assignment will include a (probably unassigned) exercise that looks at
the relation between a and phi needed to make that display hold.
You also ask whether, for a = i, one could use the method of (III).
I hope what I said in class answered that: if phi has any poles at
nonzero points, then phi(e^z) has infinitely many poles, so the
method of (III) is not applicable.
----------------------------------------------------------------------
You ask about the first display on p.227.
As I indicated in class, the key point is that the extra 2 pi i
doesn't make a difference in f(e^z) (though it does in e^(az)).
As to where the (-1) comes from, this is because dz gives -dt.
----------------------------------------------------------------------
Regarding the next-to-last display of (V) on p.227, you ask how
they calculate that the residues of the function at the two
singularities shown are -(1/2)e^(i pi a/2)/(1-e^(2 pi i a)) and
-(1/2)e^(i 3 pi a/2)/(1-e^(2 pi i a)) respectively.
By Lemma 12.2 (second sentence).
----------------------------------------------------------------------
You ask regarding p.228:
> at the beginning of section 4 they state that f is a function
> which is differentiable at z=n at for any integer n. Am I correct in
> assuming that they have simply neglected to mention that f must also
> be well-defined for the complex plane (or possess only isolated
> singularities) ... ?
Right!
----------------------------------------------------------------------
You ask regarding p.228, second line of section 4, why z = n
is a simple pole and not some other kind of singularity.
Because the functions have the forms cos pi z / sin pi z and
1 / sin pi z, where in each case sin pi z has a simple zero
at z = n, while the numerator has no zero or pole.
----------------------------------------------------------------------
You ask about the statement at the bottom of p.228 that the functions
cosec(pi z) and cot(pi z) are bounded on C_N "where the bound
is independent of N".
As you suggest, it means that the same bound works for any N. But
this does not mean that they are bounded on the whole plane, because
"any N" means "any integer N". Because N is an integer, C_N
slips through half way between the poles of cosec(pi z) and
cot(pi z), and one can find a bound that is valid on these half-way
routes. But the poles make the functions unbounded on the plane.
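One can see this numerically (my own illustration): sampling |cot(pi z)| on the boundary of the square C_N with vertices (N + 1/2)(+-1 +- i), the maximum stays below a single constant as N grows. The sampling density below is an arbitrary choice.

```python
import cmath
import math

def max_cot_on_square(N, samples=400):
    # sample |cot(pi z)| on the four sides of the square with
    # vertices (N + 1/2)(+-1 +- i)
    a = N + 0.5
    m = 0.0
    for k in range(samples):
        t = -a + 2 * a * k / (samples - 1)
        for z in (complex(t, a), complex(t, -a), complex(a, t), complex(-a, t)):
            m = max(m, abs(cmath.cos(math.pi * z) / cmath.sin(math.pi * z)))
    return m

bound = max_cot_on_square(1) + 1e-9
for N in (2, 5, 10):
    assert max_cot_on_square(N) <= bound   # one bound serves every N
```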
----------------------------------------------------------------------
You ask why on the first line of p.229, in studying the horizontal
sides of the square, the authors note that |y| >_ 1/2 without
mentioning the "+-(N + 1/2)", while later on the page, in studying the
vertical sides, they specify that x = +-(N + 1/2).
In each case, they are pointing out what is needed to prove the desired
inequalities. The functions cosec and cot have the properties
that when one is at a distance of at least pi/2 from the real axis,
they have absolute values less than certain constants, whether or not
that distance has the form pi times (integer + 1/2); so all that the
authors have to use about the horizontal sides of the square is that
|y| >_ 1/2. But on the real axis, cosec and cot have poles at
all multiples of pi, so we have to keep track of where our edges cross
the real axis relative to those poles. The fact that they cross it
half-way between them (because when N is an integer, pi (N + 1/2)
is halfway between pi N and pi (N+1)) allows the authors to prove
bounds on the cot and cosec functions, independent of N.
----------------------------------------------------------------------
You ask about the first inequality "_<" in the middle display on p.229.
They have in the numerator something which the numerator on the
preceding line is _< to, and in the denominator something which
the denominator on the preceding line is >_ to. This implies the
indicated inequality of quotients.
The inequality of numerators is a straightforward case of the
triangle inequality; the inequality of denominators is a "reverse"
application of the triangle inequality; i.e., one applies that
inequality to the summands e^{i pi z} - e^{-i pi z} and e^{-i pi z},
and then moves the absolute value of the latter term to the other side.
----------------------------------------------------------------------
Regarding p.230 you ask:
>In the center display, where they use the Estimation Lemma, how do they
>get the length of the contour to be 8N+4? I think my question comes
>from the fact that I don't really understand how they parametrized the
>square on p.228. On p.228, in the last display, is (N + 1/2) being
>multiplied by all 4 combinations of +/- 1 and +/- i? And is the
>capital N the same as the lower-case n which was mentioned a few lines
>before?
As noted in class, all four combinations are intended. Since each
side of the square extends N + 1/2 distance in each direction from
0, it has length 2N+1, so the perimeter is 8N+4.
The relation between n and N is that when one sums the residues
in a given square, the index n goes from -N to N.
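The count can be sanity-checked in a line or two (my own sketch):

```python
# Each side of the square runs from -(N + 1/2) to +(N + 1/2) in one
# coordinate, so it has length 2N + 1; four sides give perimeter 8N + 4.
for N in range(1, 6):
    side = 2 * (N + 0.5)
    perimeter = 4 * side
    assert perimeter == 8 * N + 4
```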
----------------------------------------------------------------------
You ask how Theorem 12.4, p.231, saying that the expression shown
equals the number of zeroes minus the number of poles, can be correct
when the proof shows that it equals the sum of the orders of the zeroes
minus the sum of the orders of the poles.
This is what is meant by the phrase "each counted according to
multiplicity" at the end of the statement of the theorem. It means
that a double zero is counted like two zeroes, a triple zero like
three zeroes, etc., and the same for poles.
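Here is a numerical illustration (my own made-up example, not one from the
book): for f(z) = z^2 (z-1)/(z-2), the circle |z| = 1.5 encloses a double
zero at 0 and a simple zero at 1, and no poles, so the integral of f'/f
over that circle, divided by 2 pi i, should count 2 + 1 - 0 = 3:

```python
import cmath
import math

def log_deriv(z):
    # f(z) = z^2 (z - 1)/(z - 2), so f'/f = 2/z + 1/(z-1) - 1/(z-2).
    return 2/z + 1/(z - 1) - 1/(z - 2)

n = 20000          # sample points on the circle
radius = 1.5       # encloses the zeros at 0 (order 2) and 1, but not z = 2
total = 0
for k in range(n):
    t = 2 * math.pi * k / n
    z = radius * cmath.exp(1j * t)
    dz = 1j * z * (2 * math.pi / n)   # derivative of the parametrization
    total += log_deriv(z) * dz

winding = total / (2j * math.pi)
assert abs(winding - 3) < 1e-6   # zeroes minus poles, with multiplicity
```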
----------------------------------------------------------------------
You ask, regarding the definition of "arc" on p.240, whether "arg"
denotes the principal value.
Different values of arg differ by multiples of 2 pi, so they
all give the same value of "arc". So arc(z) is the common
equivalence class to which _all_ the numbers that can be called
"arg(z)" belong.
----------------------------------------------------------------------
You ask about the sentence at the end of p.241, "Then if f is
differentiable ... we have [equation]".
I gave a correction to that in class on Friday, saying that you
should read instead, "Let f be differentiable ... and let us
write [equation]."
Remember that if you are going to miss class, you should have someone
taking notes for you!
----------------------------------------------------------------------
You ask whether on p.242, the sets S are subject to any restriction.
Yes. On the previous page, in the first line of section 2, they
say "where S ... is a domain".
----------------------------------------------------------------------
You ask about the relation between the conditions that gamma' \neq 0
and that f' \neq 0 in the discussion of conformality on p.242.
As you say, neither of them implies the other -- indeed, they are
conditions on different objects! The authors assume gamma' \neq 0
so that the path gamma will have a well-defined direction, and so
will be a suitable object for testing what f does to directions of
paths. On the other hand, f' must actually be nonzero for f to
preserve angles, as noted in the last paragraph of this page,
and discussed in more detail in class.
----------------------------------------------------------------------
You ask, regarding the discussion on p.242, whether "function",
"mapping" and "transformation" mean the same thing.
Generally speaking, yes. The biggest difference is in point of
view. One uses all three in cases where one pictures the domain and
codomain separately, and thinks of points as being "carried" from
one to the other:
--------- ---------
| . | | |
| . | ---> |. . |
| . | | . |
--------- ---------
while if one has a simpler sort of picture:
| /\
|/ \
| \/\
| \/
|----------
one tends mostly to say "function". I don't know why the authors
make such a fuss on this page about which to use!
----------------------------------------------------------------------
You ask whether Moebius mappings (p.246) are conformal, saying that
since they are not everywhere differentiable, they are not.
But they are differentiable where they are defined. The book is
sloppy in calling them maps C --> C, but when one recognizes that
their domains are of the form C - {-d/c} (or as I did in class,
regards them as functions from the extended complex plane to itself),
one finds that they are everywhere differentiable. So yes, they are
conformal.
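One can see the conformality numerically (a sketch of my own, using
w = 1/z as a sample Moebius map and a base point I chose away from
its pole): the angle between two directions through a point is
unchanged by the map.

```python
import cmath

def f(z):
    # A sample Moebius map; its only singularity is the pole at z = 0.
    return 1 / z

z0 = 1 + 1j                   # base point, well away from the pole
d1, d2 = 1, cmath.exp(0.6j)   # two unit directions through z0
h = 1e-6

# Approximate tangent vectors of the image curves t |-> f(z0 + t*d).
t1 = (f(z0 + h * d1) - f(z0)) / h
t2 = (f(z0 + h * d2) - f(z0)) / h

# Both image tangents are the originals multiplied by f'(z0), so the
# angle between them equals the original angle.
original_angle = cmath.phase(d2 / d1)
image_angle = cmath.phase(t2 / t1)
assert abs(image_angle - original_angle) < 1e-4
```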
----------------------------------------------------------------------
You ask what a Moebius mapping (p.246) will do to a rectangle.
Each of the line segments making up the rectangle will be sent to an arc
of a circle or to a line segment; so the rectangle will go to a contour
consisting, in general, of four such arcs, each meeting the next
at right angles. (If the point that the Moebius mapping sends to
infinity lies _on_ the rectangle, things become a little more
complicated; you might try thinking about that case yourself.)
----------------------------------------------------------------------
Your question of the day came after I had left for the classroom.
Yes, it would have been good if I'd spent a few minutes drawing a
picture illustrating the fact that those points whose distances from
p and q are in the ratio k lie on a circle (p.246). But you should
also find it interesting to sketch such a picture for yourself, say
taking k = 2 for concreteness.
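If you'd like to check the fact numerically as well (my own sketch, with
p, q and k chosen for convenience, not values from the book): for p = 0,
q = 3 and k = 2, completing the square in |z - p|^2 = k^2 |z - q|^2 gives
the circle with center 4 and radius 2, and every point of that circle
does have distance-ratio 2 from p and q:

```python
import cmath
import math

p, q, k = 0, 3, 2

# Claimed Apollonius circle for |z - p| = k |z - q|: completing the
# square in |z|^2 - 8x + 12 = 0 gives center 4, radius 2.
center, radius = 4, 2

for j in range(360):
    z = center + radius * cmath.exp(1j * math.radians(j))
    ratio = abs(z - p) / abs(z - q)
    assert abs(ratio - k) < 1e-9
```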
----------------------------------------------------------------------
I trust that my lecture answered your question about the first display
on p.247. In particular, the k of that equation is the k of (6)
on the preceding page.
----------------------------------------------------------------------
You ask about the structures of the set of rigid motions in Euclidean
space, and of the set of Moebius mappings of the (extended) complex
plane, both referred to at the bottom of p.247.
They don't have "bases" as vector spaces because they aren't vector
spaces -- the sum of two Moebius mappings (with different denominators)
is not a Moebius mapping, and likewise the sum of two rigid motions
is not usually a rigid motion.
However, you are onto a very important idea. They do have dimensions
in the sense that around each point, there is a fixed number of
independent "directions" in which one can vary the map. (For rigid
motions in n-dimensional real space, this number is n(n+1)/2, so one
can say the set of these has real dimension n(n+1)/2. The set of
Moebius mappings has complex dimension 3, or real dimension 6. These
numbers correspond to the concept of "number of degrees of freedom"
in the natural sciences.)
A set with a geometric structure such that one can talk about
dimension in this way is called a manifold; these are studied in
Math 141 and Math 214. In the cases of rigid motions and Moebius
transformations, the sets also form groups, and groups with compatible
manifold structures form an important concept called a Lie group,
studied in Math 261AB.
----------------------------------------------------------------------
You ask why _two_ translations are needed in the construction of the
general Moebius transformation in Theorem 13.2 (p.248).
The various sorts of transformations listed do not commute with
each other -- you can check that if you perform an inversion and then
a translation, the result will be different from that of performing
the translation first and then the inversion; and similarly for several
other combinations of operations. Hence, the translation performed at
the end of the construction does not have the same effect as if one
had performed it at the beginning. So by performing both one at the
beginning and one at the end, one can get transformations that one
couldn't get by just performing a translation in one place or the other.
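The non-commutativity is easy to check by hand or by machine (a sketch of
my own, with the particular translation and inversion chosen by me):

```python
# Translation by 1 and the inversion z |-> 1/z do not commute:
# performing them in the two orders gives different Moebius maps.
translate = lambda z: z + 1
invert = lambda z: 1 / z

z = 2 + 1j                          # any point where both composites are defined
first_then = invert(translate(z))   # 1/(z + 1)
other_order = translate(invert(z))  # 1/z + 1
assert abs(first_then - other_order) > 0.5
```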
----------------------------------------------------------------------
You ask about the phrase "they map circles (or straight lines)
into circles (or straight lines)" on p.246.
It means that the images of circles will be circles and straight
lines, and the images of straight lines will be circles and straight
lines. More precisely, one finds that if c is nonzero, then f
maps every circle passing through -d/c to a straight line, and also
maps every line through -d/c to a straight line, while it maps every
circle not passing through -d/c to a circle, and every line not
passing through -d/c to a circle.
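A concrete instance (my own choice of map and circle, not the book's):
under f(z) = 1/z, so that -d/c = 0, the circle |z - 1| = 1 passes through
0, and its image is the vertical line Re(w) = 1/2:

```python
import cmath
import math

# The circle |z - 1| = 1 passes through 0, the point that f(z) = 1/z
# sends to infinity, so its image should be a straight line -- here
# the vertical line Re(w) = 1/2.
for j in range(360):
    if j == 180:                    # z = 0 here; f is undefined at it
        continue
    z = 1 + cmath.exp(1j * math.radians(j))
    w = 1 / z
    assert abs(w.real - 0.5) < 1e-9
```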
----------------------------------------------------------------------
In connection with the reference to inversive geometry on p.248,
you ask what it is, and why it is "out of fashion" now.
Inversive geometry is the study of properties of configurations of
the plane (or a higher dimensional space) that are preserved
not only under the "motions" allowed by Euclid -- combinations
of translations, rotations and reflections -- but also inversions.
For instance, the property of being a straight line is not of this
sort, while the property of belonging to the set {lines}\union {circles}
is. Likewise, the property of two curves having equal lengths is
not, but the property of meeting at a given angle is.
As to why things go in and out of fashion -- to a large extent, this
is a mystery. When I was an undergraduate here (1959-1963), there were
20 or so of us who enjoyed folk-dancing (the dances of the Balkans,
Israel, Eastern Europe, with a few Western European and American
dances thrown in), and danced regularly on campus a couple of times
a week. When I came back as a faculty member in 1967, hundreds of
students and others were doing it. Since then, it has completely
collapsed -- there are still a few groups, but they aren't on campus,
and almost everyone who attends is in their 50's or 60's or 70's or
80's. (I'm 60.) Why did such a fun activity so completely lose its
appeal? I wish I knew.
With respect to inversive geometry, I can guess part of the reason.
In modern mathematics, geometry is mainly concerned with properties
preserved under general continuous, or continuous and differentiable
transformations, so "inversive geometry" is just a very special case,
and is considered of less interest than the general case. A couple of
the traditional, more restricted sorts of geometry, in particular,
Euclidean geometry and projective geometry, are sufficiently important
that they continue to be learned, at least in elementary form -- the
first because of its connection with the real world, the second because
of its connection with linear algebra and with other sorts of
geometry. And one encounters the "classical nonEuclidean geometries"
because of their importance in intellectual history, having taught
the world that axioms should be looked at, not as things that are
"obviously true", but as assumptions whose consequences one studies,
and seeks to find models for if these are not evident. But inversive
geometry doesn't have any of these advantages, at least not in a very
strong way; so some of us learn a little bit of it now and then, and
probably a few people here and there study it deeply, but it isn't
a hot topic. Still, if some use for it comes up, people will open the
old books and re-learn the theorems that were proved a hundred years
ago ... .
----------------------------------------------------------------------
You ask whether the conclusion that u and v satisfy Laplace's
equation (p.250) is true "for all cases of u and v as long as
they are differentiable".
It is true for all real functions u and v such that the complex
function u + iv is differentiable! (So, for instance, it is not
true for u = x^2, v = y^2, since even though each of these is
differentiable as a function of two real variables x and y, i.e.,
in the sense of Math 53, the function x^2 + iy^2 is not differentiable
in the sense of this course.)
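A quick numerical check (my own sketch) of the Cauchy-Riemann equations,
which are what complex differentiability of u + iv comes down to, makes
the contrast concrete:

```python
h = 1e-6
x0, y0 = 1.3, 0.7   # a generic sample point

def partials(g):
    # Numerical partial derivatives of g at (x0, y0).
    gx = (g(x0 + h, y0) - g(x0 - h, y0)) / (2 * h)
    gy = (g(x0, y0 + h) - g(x0, y0 - h)) / (2 * h)
    return gx, gy

# u + iv = z^2, i.e. u = x^2 - y^2, v = 2xy: the Cauchy-Riemann
# equations u_x = v_y and u_y = -v_x hold.
ux, uy = partials(lambda x, y: x**2 - y**2)
vx, vy = partials(lambda x, y: 2 * x * y)
assert abs(ux - vy) < 1e-6 and abs(uy + vx) < 1e-6

# u = x^2, v = y^2: here u_x = 2x but v_y = 2y, so Cauchy-Riemann
# fails at generic points, and x^2 + i y^2 is not differentiable
# in the sense of this course.
ux2, uy2 = partials(lambda x, y: x**2)
vx2, vy2 = partials(lambda x, y: y**2)
assert abs(ux2 - vy2) > 0.5   # 2*1.3 = 2.6 versus 2*0.7 = 1.4
```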
----------------------------------------------------------------------
You ask how they get the equations on p.250 for u(x,y) and v(x,y),
where f(z) = z e^z .
When there is a straightforward computation, they often leave it to
the reader -- which is a good practice. They have made the convention
that z = x+iy and w = f(z) = u+iv. So you should substitute x+iy
for z in the function w = z e^z, use the formulas for the real
and imaginary parts of e^z that were obtained when we studied the
exponential function (Chapter 5), multiply out, and call the real and
imaginary parts of the resulting expression "u" and "v".
If you have difficulty doing so, or get a result different from that in
the book, then show me your computations and I will show you how to
continue where you got stuck, or what was wrong if you got a different
answer. But when the book shows you the result of a computation
without explanation, you should always try the computation yourself,
reviewing points that are needed for it if you don't remember them; and
then, if something goes wrong, ask specifically about the point where
the problem came up.
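In case it helps you check your work, here is a numerical test of the
result the multiplication should give -- my own working out, so compare
it with the book's display rather than taking it on faith:

```python
import cmath
import math

for (x, y) in [(0.5, 1.2), (-1.0, 2.0), (2.0, -0.3)]:
    z = complex(x, y)
    w = z * cmath.exp(z)
    # Multiplying out (x + iy) e^x (cos y + i sin y) and collecting
    # real and imaginary parts:
    u = math.exp(x) * (x * math.cos(y) - y * math.sin(y))
    v = math.exp(x) * (x * math.sin(y) + y * math.cos(y))
    assert abs(w.real - u) < 1e-9 and abs(w.imag - v) < 1e-9
```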
----------------------------------------------------------------------
Concerning the example at the top of p.251, with u = x^2 - y^2
and v = 2xy, you write "if v is the stream line then u and v
should be orthogonal, but that doesn't seem the case".
I'm not sure whether, when you wrote "if v is the stream line", this
was short for the correct statement, "if the curves v = constant are
the stream lines", and likewise whether when you wrote "u and v
should be orthogonal" you meant "the curves u = constant and v =
constant should be orthogonal". If so, show me your computations
in which the curves u = constant and v = constant didn't come out
orthogonal, and I'll see what I can do to find the mistake.
(Whenever you write saying that something doesn't seem to work out
right, it's best to show your computations, at least in outline, so
that I know what to respond to.)
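For reference, here is the computation I have in mind (my own sketch):
the curves u = constant and v = constant are orthogonal exactly when the
gradients of u and v are orthogonal, and here the dot product of those
gradients is 4xy - 4xy = 0:

```python
# grad u = (2x, -2y), grad v = (2y, 2x); their dot product is
# (2x)(2y) + (-2y)(2x) = 0, so the level curves meet at right angles
# wherever the gradients are nonzero.
for (x, y) in [(1.0, 2.0), (-0.5, 0.3), (3.0, -1.0)]:
    grad_u = (2 * x, -2 * y)
    grad_v = (2 * y, 2 * x)
    dot = grad_u[0] * grad_v[0] + grad_u[1] * grad_v[1]
    assert dot == 0.0
```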
----------------------------------------------------------------------
You ask why we can't compare the power series around z=0 and the power
series around z=2 directly, without using the power series around z=i
as is done on p.258.
One cannot speak of two power series with centers 0 and 2 as being
"the same" series -- one is the sum of a series in powers of z-0 and
the other the sum of a series in powers of z-2, and these are not
the same thing. So the best one can do is talk about two series
with different centers defining the same _function_ on the set where
their discs of convergence overlap. If they do, and if that set is
nonempty, then each of the power series uniquely determines the other,
by the reasoning using the Identity Lemma that I gave in class today.
But if the discs of convergence do not overlap, then there is no way
even to say they define the same function; and working via one or more
intermediate discs (or other sorts of regions, as will be mentioned in
the next reading) is the best we can do.
----------------------------------------------------------------------
I hope that what I said in class answered your question as to why
the radius of convergence would be precisely the distance to the
nearest (nonremovable) singularity (p.258). The precise result
called on to show the radius is at least that value is Lemma 10.2
(or Theorem 10.3).
On the relation to the extra credit problem: That concerns a case
where we do not know that there is any way to define the function
differentiably on points outside the disc, and the core of that problem
is how we would get such a _global_ extension if there were no points at
which the function was not _locally_ extendable.
----------------------------------------------------------------------
You ask how to verify that no single choice of z_0 will give a
power series expansion valid on all of C \ {+- 1} (p.259).
I hope that what I said in class clarified this. Briefly, any
power series converges on a disc, possibly the "infinite disc"
given by the whole complex plane. If one has a power series with
a finite radius of convergence, it would certainly not converge
on all of C \ {+- 1}, while a series with infinite radius
of convergence would converge to a function continuous at +-1
as well, though 1/(1-z^2) approaches infinity as one approaches
those points, so it couldn't give that function.
----------------------------------------------------------------------
You point out in connection with the discussion on pp.259-260 that
three overlapping discs that surround the origin could lead to
trouble if we try to extend a function like log z.
Right!
In the context of today's reading, this is not a problem, because
we are _assuming_ the function f(z) is differentiable on the
domain we are interested in, and our focus is just on how to
determine the values of the function in one region from its values
in another. The logarithm function doesn't satisfy our assumption
because we can't choose it in a way that is differentiable in a
region that surrounds 0.
But in later sections, we will consider a function which we
assume given on one region, and which we want to try to extend
in all possible ways to other regions; and precisely the difficulty
you point to will come up. We will respond to it with a dramatic
change of approach!
----------------------------------------------------------------------
You ask about the "at most 3 discs" comment in the next-to-last line
on p.260.
I'm not sure exactly what the authors mean because when they refer to
the number of discs in the chain, they don't say whether they are
counting S_1, nor whether they require that the last disc in the chain
have z_0 as its center or just somewhere inside it. But let me give
you a rough sketch of a proof of a precise statement of the same sort:
that given any two points z_0 and z_1 in the domain of 1/(1-z^2),
one can find two overlapping discs, D_0 containing z_0 and D_1
containing z_1 in that domain. This gives a chain of two discs
"connecting" those points.
If z_0 and z_1 both lie above the x-axis, or both lie below it,
then one can find a single disc containing them both. Namely, if
they are both above, consider the disc centered at C i for a very
large real number C, and tangent to the x-axis at x=0. As one
makes C larger and larger, this disc looks (at any given scale of
observation) more and more like the whole upper half plane. Therefore,
given any two points in the upper half-plane, it eventually contains
both. If both points lie below the x-axis, use the same idea with
negative C.
If z_0 lies above the x-axis and z_1 below it, take any real
number z_3 > 1. Then I claim there is a disc D_0 in our desired
domain which contains z_0 and z_3, and a disc D_1 which contains
z_1 and z_3. Namely, draw the line-segment from z_0 to z_3,
take its perpendicular bisector, extend it by a large distance C in
the direction away from the x-axis, and take D_0 to be centered at
the resulting point on this line, and of radius just a little larger
than the distance from that point to z_0 and z_3. If C is large
enough and if the radius exceeds the value that would put z_0 and z_3
on its boundary by a small enough amount, then the resulting disc
will not contain +-1, hence lie in the domain of 1/(1-z^2). We
construct D_1 analogously. Since both discs contain z_3, they
overlap, again giving the desired length-2 chain.
The remaining cases, where z_0 and/or z_1 lie on the x-axis,
are similar. For one example, suppose both of them do. Then we
construct very large discs centered directly above these two points,
one of which dips just a little below the x-axis near z_0, and so
contains it, and the other of which likewise dips just a little below
the x-axis near z_1. If we take the centers of these discs high
enough, they will have large enough radii so that they have to overlap.
If you find it interesting, you might enjoy trying to turn these
hand-waving ideas into a precise proof (trying to minimize, as you go,
the number of different cases that one has to consider). If not,
don't worry; it's not essential material for this course.
----------------------------------------------------------------------
You ask how, in the situation discussed on pp.260-261, one constructs
the power series on the intermediate discs.
As I said in class on Friday, the actual _computation_ of the terms
of the power series is something outside the scope of this course.
The point being made is that one series will uniquely determine the
other because of the Identity Theorem, as I discussed in class today.
All this is really just a way of motivating the concepts to be used
in the rest of this chapter; as the book says on p.261, just above
the figure, a "psychological springboard".
----------------------------------------------------------------------
You ask about the statement "the restriction to power series and
discs is inessential" on p.261. What they mean is that the idea of
extending a function from one region to another, and then to another,
which they
are considering here when the regions are discs, will also make sense
for regions that aren't discs. So the fact that they are considering
discs (and functions defined on those discs by power series) here is
not essential to the scope of the idea. The clue that something like
this is what they mean is the phrase "(as in the next section)".
----------------------------------------------------------------------
Regarding the example on p.262 you write
> Prior to this chapter we have usually assumed that functions had no
> non-isolated singularities, clearly a function with a natural boundary
> has non-isolated singularities. Are we making any assumptions about
> our functions now other then that they are analytic in some domain?
If a function is analytic in a domain, then any singularity is a point
outside that domain. When we talk about an isolated singularity, we
are really talking about a point which is outside the domain of the
function, but such that a punctured neighborhood of that point is in
the domain of the function. So restricting attention to isolated
singularities simply meant that when we considered properties of the
functions as we approached points outside their domain, we only looked
at the case of points that were surrounded by punctured discs in
the domain.
On p.262, as you say, we do look at the behavior as we approach other
sorts of points outside the domain of the function. But this does
not mean we are weakening our assumption on the function.
It is also true that the authors sometimes speak of a function having
a singularity "in" a domain D; in that case D is not really the
domain of the function; the domain is D with the points where the
singularities occur deleted. But that is just shorthand language, not
a change in our assumptions.
And finally, it is true that all the _examples_ we have seen until
this page could be analytically extended to everywhere in C except
for a countable set of singularities. But we never assumed that in
proving our results; so again this is not a change in our assumptions,
but just an enlargement of the set of examples we know about.
----------------------------------------------------------------------
You ask how we calculate the radius of convergence of the series
Sigma z^(n!) on p.262 to be 1.
Well, first, here's the "mechanical" way of doing it, via the formula
(lim sup |a_n|^(1/n))^-1 :
To apply that formula, we need to express the series in the form
Sigma a_n z^n. Looking at the coefficients of all nonnegative
integer powers of z, we see that a_n is:
0 if n is not the factorial of an integer,
1 if n is the factorial of an integer > 1,
2 if n = 1 (coming from the two terms z^(0!)+z^(1!))
Taking the nth root for each n > 0, we again get 0's and 1's
(plus the single value 2 at n = 1, which doesn't affect the lim
sup). Hence the lim sup, the supremum of the subsequential limit
points, is 1, so 1^-1 = 1 is the radius of convergence.
To see this more quickly, note that for |z| < 1, the comparison
test with Sigma z^n (comparing the nth term of the given series
with the n!'th term of the geometric series) shows that the sum
converges, while for |z| = 1, there are infinitely many summands
of absolute value 1, so the summands don't approach 0, so the
series diverges; so the radius of convergence has to be 1.
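To see the two behaviors numerically (my own sketch, with sample values
of r that I chose): the terms r^(n!) die off extremely fast inside the
disc, while on the circle |z| = 1 every summand has absolute value 1:

```python
import math

# Inside the disc: the terms r^(n!) shrink extremely fast, so the
# partial sums converge (compare with the geometric series).
r = 0.9
terms = [r ** math.factorial(n) for n in range(10)]
assert terms[4] < 0.1        # already 0.9^24 is small
tail = sum(r ** math.factorial(n) for n in range(10, 15))
assert tail < 1e-40          # n >= 10 contributes essentially nothing

# On the circle |z| = 1: every summand has absolute value 1, so the
# terms do not tend to 0, and the series diverges there.
assert all(abs(1.0 ** math.factorial(n)) == 1.0 for n in range(10))
```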
----------------------------------------------------------------------
You ask regarding the next to last display on p.262,
> How does one find that the summation of r^n! from n=q to q+N
> approaches N+1?
First you have to ask yourself "approaches N+1 as what approaches
what?" The answer is given on the next line: "as r --> 1". Since we
are talking about a sum of finitely many (namely, N+1) functions of
r, we find its limit as r --> 1 by taking the limit of each of
the summands and adding. Let me know if you have any difficulty with
that step.
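The step is easy to test numerically (my own sketch, with sample values
of q and N that I chose): for r close enough to 1, each of the N+1
summands is close to 1:

```python
import math

q, N = 2, 5
r = 1 - 1e-10   # close enough to 1 for the largest exponent involved
total = sum(r ** math.factorial(n) for n in range(q, q + N + 1))
# Each of the N+1 terms tends to 1 as r -> 1, so the sum tends to N+1.
assert abs(total - (N + 1)) < 1e-4
```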
----------------------------------------------------------------------
You ask how the last display on p.262 is compatible with the statement
that for |z| < 1 the series converges.
The key is in the quantification of that statement. They show that for
all N there exists an epsilon such that for r\in(1-epsilon,1)
the stated inequality involving N holds. This is different from
saying that there exists an epsilon such that for all N the inequality
involving N holds for all r\in(1-epsilon,1) . If the latter were
true, then one could indeed conclude that for that value of epsilon,
every r\in(1-epsilon,1) would give a point where the series diverged.
With what they prove, one can only conclude that the closer r gets
to 1, the bigger the values we get.
----------------------------------------------------------------------
Regarding the paragraph on p.265, before section 14.4, you ask
"So is a multiform function f a "normal" function that given x
there is only one f(x) corresponding to f ?"
No. A multiform function, as they define it, is not a function;
it is a collection of infinitely many functions, on infinitely many
different domains. Each of these infinitely many functions is a
"normal" function with just one value at each point, but two
different functions in the family can have different values
at the same point. (For instance, the "complete analytic"
version of the square root function includes one pair (f_1,S_1)
where S_1 is the right half-plane and f_1(4) = 2, and another
pair (f_2,S_2) where S_2 is the right half-plane and f_2(4) = -2.
Each of f_1 and f_2 is a single-valued function.)
It is not till we get to the concept of Riemann surface that we have
a way of turning a multiform function into _one_ normal function;
namely a function on the Riemann surface. The Riemann surface of
the square root function has two points corresponding to the complex
number 4. At one of these points, the square root function has
the value +2; at the other, it has the value -2.
----------------------------------------------------------------------
Regarding the definition of the complete analytic continuation of a
function on p.265, you ask
> ... can (f_1,S_1) and (f_2,S_2) be elements in the complete analytic
> function F if the intersection of S_1 and S_2 is only 1 point z and
> f_1(z) is not equal to f_2(z)? ...
The intersection of S_1 and S_2 will be an open set, so it can't
consist of just one point. If it is not empty, it contains uncountably
many points (namely, if z is a point of this intersection, it contains
a disc N_epsilon (z) for some epsilon).
As for whether two elements (f_1,S_1) and (f_2,S_2) can belong to the
same complete analytic function even if they disagree on the
intersection of S_1 and S_2 -- yes, as long as they _can_be_connected_
by a chain of function elements with intersecting domains, each of
which _does_ agree with the _next_ one on the intersection of their
domains. That is what the definition of analytic continuation says.
----------------------------------------------------------------------
You ask about the notation "z\in S_3 = H_3, im(z) >_ 0" in the third
line of the big display on p.266.
Here "S_3 = H_3, im(z) >_ 0" is not the description of a single
set to which z is required to belong. Rather, the comma separates
two conditions on z, namely "z\in S_3" (which, as is noted, = H_3),
and "im(z) >_ 0".
----------------------------------------------------------------------
You ask about the note on p.266, "(where r+4 is to be interpreted
mod 8, of course)".
Since you haven't had Math 113, I'm not sure whether you have been
exposed to "arithmetic mod n" in any form. What they mean is
illustrated by taking r=6. Then the statement "f_r(z) will
take one of the two values and f_(r+4)(z) the other" becomes
"f_6(z) will take one of the two values and f_2(z) the other",
because when r=6, the subscript r+4 becomes 10, and the value
in {1,...,8} that 10 is congruent to mod 8 is 2.
Have you seen arithmetic mod n in some form? Sufficiently to make
sense of the above explanation?
If not, then replace the "mod 8" comment by a proviso that if
r+4 is >8, we use r+4-8 = r-4 instead of r+4.
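In code form (a sketch of my own, assuming, as in the passage above,
that the subscripts run over {1, ..., 8}), the rule looks like this:

```python
def partner(r):
    # Map subscript r in {1, ..., 8} to r + 4, wrapped back into
    # {1, ..., 8} when it exceeds 8 (i.e., interpreted mod 8).
    return (r + 4 - 1) % 8 + 1

assert partner(6) == 2    # the r = 6 example: 6 + 4 = 10, congruent to 2
assert partner(3) == 7    # no wrapping needed when r + 4 <= 8
```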
----------------------------------------------------------------------
Regarding the discussion of multiform function on pp.265-267 you write:
> ... I can't help but wonder: Is there any multiform function that
> doesn't involve the 'log' function or a 'root' (as in squre root)
> function?
Yes.
The easiest "examples" are inverse trigonometric functions. However,
using the relations cos z = (e^iz + e^-iz)/2 etc., one can in fact
express these in terms of logarithms, so they are not really what you
are asking for.
A class of examples that are genuinely not obtained from roots and
logarithms arise in the same way that those two classes of examples
do: by taking inverse functions of non-one-to-one functions. For
example, solutions to z = e^w + w give a multiform function w = f(z).
More generally, the set of solutions to an equation F(z,w) = 0 where
F is differentiable in two variables will give such functions. These
functions cannot be expressed directly in terms of functions we are
familiar with, so the book doesn't give them as examples.
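To make this concrete (my own numerical sketch, not an example from the
book): for z = 1, the equation e^w + w = z has the obvious solution
w = 0, and Newton's method started from a complex guess I picked finds a
second, genuinely different solution:

```python
import cmath

def g(w):
    # We seek solutions w of e^w + w = 1, i.e. roots of g.
    return cmath.exp(w) + w - 1

# One solution is w = 0.  Newton's method from a complex starting point
# (1.5 + 4.5i, chosen near where another root should lie) finds a second.
w = 1.5 + 4.5j
for _ in range(50):
    w = w - g(w) / (cmath.exp(w) + 1)   # Newton step; g'(w) = e^w + 1

assert abs(g(w)) < 1e-9   # it really is a root
assert abs(w) > 1         # and a different one from w = 0
```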
----------------------------------------------------------------------
You ask whether one has a Laurent expansion of a function f about
a branch point (p.267).
No -- if a function has a Laurent expansion about a point, then where
this converges, its sum gives a single-valued function; but a branch
point is a point in the neighborhood of which a function can't be made
single-valued.
(One can generalize the concept of Laurent expansion so that some
functions do have such expansions in neighborhoods of branch points; for
instance, one can consider series involving fractional powers z^(p/q).
But that is outside the scope of this course.)
----------------------------------------------------------------------
You ask what Riemann surfaces (pp.268-271) are used for.
As the book indicates, we need them to study functions like the square
root function or the logarithm function in a decent way. The approach
to logarithms in Chapter 7 gives us functions that are discontinuous
along some line; the approach of section 14.3 gives a patchwork of
bits of functions. But regarding the logarithm as a function on a
Riemann surface, we can make it everywhere differentiable, and use the
theory of differentiable functions that we have developed through this
book. (Once we have extended that theory from functions on C to
functions on a Riemann surface.)
In a course like 185 there is not time to include a full study of
Riemann surfaces, so the choices are not to mention them at all, or
to just give the idea, without real applications. Most 185 texts
make the former choice. These authors make the latter choice, which
I like, because the concept is an exciting and beautiful one.
----------------------------------------------------------------------
You ask how the authors get display (8) on p.272 from display (7)
on p.271.
Look at the second line of display (7). The expression
log r + i theta there is a value of log z. (Because, by (6),
z = e raised to that power.) So the authors aren't saying that
the final expression in (7) clearly equals (8); they are saying
that the approach used to get (7), as seen in the second line,
clearly amounts to making the definition (8). The final expression
in (7) is just an explicit calculation of what this gives.
----------------------------------------------------------------------
You ask about the restrictions on d in the expression m-n = qd+k,
p.273, line after first display.
It can be any integer. This is the Division Algorithm (i.e.,
division of integers with remainder), as in Math 113.
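A quick illustration (my own sketch, with sample values I chose) using
Python's built-in divmod, which implements exactly this division with
remainder:

```python
# In m - n = qd + k, the divisor is q; d is the quotient (any integer,
# possibly negative) and k the remainder, with 0 <= k < q.
m, n, q = 3, 11, 5          # sample values; here m - n = -8
d, k = divmod(m - n, q)
assert m - n == q * d + k and 0 <= k < q
assert (d, k) == (-2, 2)    # -8 = 5*(-2) + 2, so d comes out negative
```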
----------------------------------------------------------------------
You ask how the authors chose the intervals in the chart on p.276.
Good question. I hadn't looked closely at it, just assuming "they
do the obvious thing".
Well, first of all, the stretches of gamma where they use a given
branch of z^(1/5) can't be as long as 2 pi, since if they were,
the domains containing them would meet at "both ends". So they make
those stretches each have length pi. But each has to be contained
in a domain, and domains are open, and so don't contain their
boundary points. So to create a domain containing gamma(t) for
all t between 0 and pi, for instance, they define D_1 to
consist of all points with argument between -pi/4 (a value < 0)
and 5pi/4 (a value >pi). They could have used other values, such
as -.1 pi and 1.1 pi; they just used this as a simple choice.
But I see that they wrote those ranges using square brackets instead
of parentheses! Another correction to send them.
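The five branches in that chart all come from the formula
z^(1/5) = exp((log z)/5) with the five possible choices of log z; a
sketch checking this (the sample point z is arbitrary):

```python
import cmath

# The five values of z^(1/5) are exp((Log z + 2*pi*i*k)/5), k = 0,...,4;
# each of them has fifth power equal to z.
z = 1 + 2j  # arbitrary sample point
roots = [cmath.exp((cmath.log(z) + 2j * cmath.pi * k) / 5)
         for k in range(5)]
print(max(abs(r**5 - z) for r in roots))  # essentially 0
```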
----------------------------------------------------------------------
You ask whether a multiform function might not have a global
antiderivative on its Riemann surface, and in that case, whether it
might be possible to apply "method 3" (p.277) but not "method 4"
(p.278).
It's certainly possible to have multiform functions without global
antiderivatives -- anything that can go wrong for uniform functions
can also go wrong for multiform functions! For instance, by adding to
the "well-behaved" multiform function sqrt z the "badly behaved"
uniform function 1/z, we get a "badly behaved" multiform function,
one that does not have a global antiderivative on its Riemann surface.
But in such a case, we can use the Riemann surface of its (multiform)
antiderivative. For the above example sqrt z + 1/z, this is the
same as the Riemann surface of log z.
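The "bad behavior" of 1/z can be seen numerically: its integral
counterclockwise around the unit circle is 2 pi i rather than 0, so no
antiderivative can exist on all of C \ {0}. A crude sketch using a
Riemann sum:

```python
import cmath

# Integrate 1/z counterclockwise around the unit circle by a Riemann
# sum; an antiderivative on C \ {0} would force the answer to be 0,
# but we get 2*pi*i instead.
N = 10000
points = [cmath.exp(2j * cmath.pi * j / N) for j in range(N + 1)]
total = 0j
for z0, z1 in zip(points, points[1:]):
    total += (1 / z0) * (z1 - z0)
print(total)  # close to 2*pi*i
```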
So I don't see, offhand, any situation where "method 4" couldn't
be applied. The one prerequisite for any of these methods to be
applicable is that we be able to "follow" the values of the function
along our path. If the function is given by a formula we can compute
with, and its multivaluedness is expressed by certain choices in the
formula, we can do this.
----------------------------------------------------------------------
You ask whether it matters that in Example 2, p.278, they use the
cut plane C_0 rather than C_pi.
If they used C_pi, the function f(z) would be discontinuous along
the negative real axis, since z^a would be defined using a branch
of log z that is discontinuous there. Since that axis cuts through
the region around which we are integrating, Cauchy's residue theorem
would not be applicable there.
As the authors confess at the top of p.280, Cauchy's residue theorem
is not really applicable to C_0, either, since the line of
discontinuity runs along two segments of the path of integration!
But they discuss several ways of "fixing" this problem. Method 1
would not work for C_pi, but does work for C_0. The other two
methods involve abandoning the choice of domain C_0 anyway.
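The discontinuity in question is easy to exhibit numerically. Here is a
sketch of a branch of z^a built from arguments in [0, 2 pi), i.e., on
the cut plane C_0 (the exponent a is an arbitrary sample value):

```python
import cmath

a = 0.3  # arbitrary sample exponent

def branch_power(z):
    # A branch of z^a using the argument taken in [0, 2*pi); it is
    # discontinuous across the positive real axis, where the argument
    # jumps between values near 0 and values near 2*pi.
    theta = cmath.phase(z) % (2 * cmath.pi)
    return cmath.exp(a * (cmath.log(abs(z)) + 1j * theta))

above = branch_power(complex(2.0, 1e-9))
below = branch_power(complex(2.0, -1e-9))
print(above / below)  # about e^(-2*pi*i*a), not 1: a jump across the axis
```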
----------------------------------------------------------------------
Regarding the second sentence after the first display on p.279,
you write
> If you assume that p<1 and R>1 the singularities i and -i are not
> necessarily inside the contour ...
Do you remember that the "inside" means the set of points about
which the winding number is nonzero? If you feel those points are
not inside, or aren't clear how to tell whether they are, come to
office hours so we can get this straight.
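For a concrete version of "inside": the winding number of a closed path
around a point can be computed by adding up argument changes along the
path. A sketch for polygonal paths (it assumes no edge passes through
the point, or subtends an angle of pi or more at it):

```python
import cmath

def winding_number(vertices, p):
    """Winding number about p of the closed polygon with these vertices."""
    total = 0.0
    for a, b in zip(vertices, vertices[1:] + vertices[:1]):
        # argument change along the edge from a to b, as seen from p
        total += cmath.phase((b - p) / (a - p))
    return round(total / (2 * cmath.pi))

square = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]  # counterclockwise square
print(winding_number(square, 0))        # 1: the origin is inside
print(winding_number(square, 3 + 3j))   # 0: this point is outside
```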
----------------------------------------------------------------------
You ask where the third integral in the second display on p.279 comes
from.
It represents integration over the third stretch of the path the
authors are following: in Fig. 14.17 they start at the point rho
on the real axis, integrate along the interval from rho to R, then
counterclockwise around the big circle, then back from R to rho (this
is the integral you ask about) and then clockwise around the little
circle. When they do the third integral, they don't use the same
branch of z^a/(1+z^2) that they used in the first integral, since
over the path that they have followed, it has moved onto a different
branch, as I discussed in class today. They indicate this by sticking
an "e^(2 pi i)" in after the x, by which they mean to show that the
a-th power is to be taken as e^(a(Log z + 2 pi i)); though this
notation is not based on any conventions they have used before.
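The effect of that shift in the branch of log can be checked directly;
a sketch (a and x here are arbitrary sample values):

```python
import cmath

# Computing x^a with log x = Log x versus Log x + 2*pi*i: the second
# choice multiplies the result by e^(2*pi*i*a), which is where the
# factor e^(2 pi i a) in the bottom display comes from.
a, x = 0.3, 2.0  # arbitrary sample values
first = cmath.exp(a * cmath.log(x))                        # uses Log x
returning = cmath.exp(a * (cmath.log(x) + 2j * cmath.pi))  # shifted branch
print(returning / first)  # equals e^(2*pi*i*a)
```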
----------------------------------------------------------------------
You ask about the third integral in the big display on p.279.
As I said in class, "x e^(2 pi i)" would logically mean just the
same as x, since e^(2 pi i) = 1. What the authors intend by this
strange notation is to remind you that z has gone 360 degrees
(in other words, 2 pi radians) around the origin to get to this
point, so that in raising it to the a-th power, we should use values
of log z that are real numbers + 2 pi i, rather than the real
values that we used in the first integral; thus we get the factor
e^(2 pi i a) in the bottom display on the page.
At the top of the next page, they admit that what they have done
requires justification (though they don't mention that the notation
they have used needs this explanation!), and they talk about how this
justification can be done.
----------------------------------------------------------------------
You ask about the reversal of limits of integration and the reversal
of sign in the last display on p.279, as compared with the third
integral in the long display, on which it is based.
They want to get that third integral into a form that can be compared
with the first integral. One major difference between the first and
third integrals is the order of the limits of integration, so they
reverse this order, noting that this changes the sign -- i.e., they use
the general fact "integral from b to a = minus integral from a to b".
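The sign rule is easy to confirm numerically; a sketch using a midpoint
Riemann sum (the integrand x^2 is an arbitrary example):

```python
# Reversing the limits of integration reverses the sign:
# integral from b to a = - integral from a to b.
def midpoint_sum(f, a, b, n=1000):
    # Midpoint Riemann sum; exactly antisymmetric in (a, b), up to
    # floating-point roundoff.
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h

fwd = midpoint_sum(lambda x: x**2, 0.0, 1.0)
back = midpoint_sum(lambda x: x**2, 1.0, 0.0)
print(abs(fwd + back))  # essentially 0
```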
----------------------------------------------------------------------
You ask about the "nifty footwork" referred to on p.280.
You might not be remembering the correct definition of the domain
C_0: it is C \ {nonnegative real axis}. So in fact the first
and third parts of the path of integration are not in that domain.
When the authors say "Did you see the nifty footwork?" they mean "Did
we succeed in putting one over on you?" Since they confess to having
done something not justified in the terms they had presented it, and
then discuss how it can be justified, they can be excused for having
done so. I hope that what I said in class (or what they wrote)
did make clear how their calculation could be justified.
----------------------------------------------------------------------