Homework #6 for Math H110, fall '02
1. Let M,N be two commuting square matrices. Show that for each eigenvalue of M, there is an M-eigenvector (with that eigenvalue) that is also an N-eigenvector (with probably some other eigenvalue). Use this to show that there exists a basis in which M,N are both upper triangular, i.e. they're "simultaneously upper triangularizable".
Obviously I forgot to say "over an algebraically closed field"! Or else even the M=N case is unsolvable.
A. Pick an eigenvalue x of M, and let W be the subspace consisting of M-eigenvectors of eigenvalue x (and also the zero vector). If w is in this space, then so is Nw, since M(Nw) = N(Mw) = N(xw) = x Nw.
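Since N preserves W, and W is nonzero, the restriction of N to W has an eigenvector (this is where we need the field to be algebraically closed). That vector is an eigenvector of both M and N, so we're done with the first part.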
Now repeat the proof we had in the case of one matrix - pass to the quotient space by this line (M and N both preserve the line, so they both act on the quotient, and they still commute there), and do induction.
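Here is a small check of the first part on one concrete pair, as a sketch only. The matrices M, N and the helper names are made up for illustration; the eigenvalue 1 of M has a two-dimensional eigenspace W, and N maps W to itself.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Illustrative commuting pair (an assumption, not from the homework).
M = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
N = [[0, 1, 0], [1, 0, 0], [0, 0, 5]]
commute = matmul(M, N) == matmul(N, M)

# w is an M-eigenvector with eigenvalue 1; Nw is again one.
w = [1, 0, 0]
Nw = matvec(N, w)
stays_in_W = matvec(M, Nw) == Nw

# Inside W, N has the eigenvector v, with eigenvalue -1.
v = [1, -1, 0]
Mv = matvec(M, v)   # equals v, so the M-eigenvalue is 1
Nv = matvec(N, v)   # equals -v, so the N-eigenvalue is -1
```

Note that the two eigenvalues of v differ, as the problem statement warns they usually will.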
2. Let M,N be two commuting diagonalizable matrices. Use last week's homework (perhaps) to show that they're simultaneously diagonalizable.
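A. Put M in JCF, which is diagonal since M is diagonalizable, with equal eigenvalues grouped together. Then N is block diagonal, with one block for each eigenvalue of M (by last week's homework). Since N is diagonalizable, it satisfies a square-free polynomial. Therefore each of its blocks does. Therefore they're each diagonalizable, by conjugating with a block of the same size. Put all these blocks together, and you get a matrix X that commutes with M (because M is a scalar matrix on each block), and has XNX^{-1} diagonal.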
3. Let M = D1 + N1 = D2 + N2, where D1,D2 are diagonalizable, N1,N2 are nilpotent, and everybody commutes with everybody else. Show D1=D2.
A. We have D1-D2 = N2-N1. Since D1 and D2 commute and are each diagonalizable, they're simultaneously diagonalizable, so D1-D2 is diagonalizable.
If N2^m = 0, and N1^n = 0, then (N2-N1)^{m+n} = 0 (we can apply the binomial theorem because N1 and N2 commute). So D1-D2 is nilpotent. But it's also diagonalizable, so it's zero.
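To see why the commuting hypothesis matters in the binomial step, here is a sketch with two made-up pairs of nilpotent matrices: one pair commutes and one does not. All names are illustrative.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, k):
    """k-th power of A by repeated multiplication (k >= 0)."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

def sub(A, B):
    """Entrywise difference A - B."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_zero(A):
    return all(x == 0 for row in A for x in row)

# Commuting pair: N2 is the square of N1, so N1^3 = 0 and N2^2 = 0.
N1 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
N2 = matmul(N1, N1)
commuting_case = matpow(sub(N2, N1), 2 + 3)   # (N2-N1)^(m+n)

# Non-commuting pair: both square to zero, but the difference is invertible.
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
noncommuting_case = matpow(sub(B, A), 2 + 2)
```

In the second pair, (B-A)^2 is minus the identity, so no power of B-A can vanish.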
4. Use JCF to show that if we feed a matrix into its characteristic polynomial, the result is the zero matrix. (It's very easy to show that it has all zero eigenvalues, but that's not good enough.)
A. The characteristic polynomial is the product of (lambda-e)^(m_e), where e varies over the eigenvalues and m_e the dimension of the corresponding generalized eigenspace. Feeding in the JCF to this polynomial is the same as applying the polynomial to each block.
When we plug in a JCF block with e on the diagonal, its size is at most m_e, so the (lambda-e)^(m_e) factor is the zero matrix. Multiplying out, every block is killed.
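The block-by-block argument can be checked on a sample matrix in JCF. This is only a sketch; the helper jordan and the particular list of blocks are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def jordan(blocks):
    """Matrix in JCF built from a list of (eigenvalue, block size) pairs."""
    n = sum(size for _, size in blocks)
    J = [[0] * n for _ in range(n)]
    pos = 0
    for e, size in blocks:
        for i in range(size):
            J[pos + i][pos + i] = e
            if i + 1 < size:
                J[pos + i][pos + i + 1] = 1
        pos += size
    return J

def char_poly_at(blocks):
    """Evaluate the product of (M - e)^(m_e) at M = jordan(blocks)."""
    M = jordan(blocks)
    n = len(M)
    # m_e = total size of all blocks with eigenvalue e.
    mult = {}
    for e, size in blocks:
        mult[e] = mult.get(e, 0) + size
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for e, m in mult.items():
        shifted = [[M[i][j] - (e if i == j else 0) for j in range(n)]
                   for i in range(n)]
        for _ in range(m):
            result = matmul(result, shifted)
    return result

# Eigenvalue 2 with blocks of size 2 and 1 (so m_2 = 3), eigenvalue 5 once.
value = char_poly_at([(2, 2), (2, 1), (5, 1)])
```

Here the characteristic polynomial is (lambda-2)^3 (lambda-5), and value comes out as the 4x4 zero matrix.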
5. Say M is in JCF, and p is its characteristic polynomial. If for each nonzero polynomial q of degree less than that of p, it is NOT true that q(M) = zero matrix, then describe M. (How big are the blocks, etc.)
A. The answer is that M must have only one block for each eigenvalue. We have to prove both directions.
Let q(x) be the product of (x-f)^(k_f) for some numbers f (every polynomial other than zero can be rescaled to look like this, since we're still over an algebraically closed field). Let N be a Jordan block in M, with e along the diagonal. Then q(N) is the product of the terms (N - f*(the identity))^(k_f). Each term in this product is upper triangular, with e-f's along the diagonal; the only way a term can be noninvertible is for f=e. The only way for the product to be zero is therefore for the f=e term to be zero, and that only happens if k_e is at least the size of N.
Therefore for q(M)=0, it is necessary and sufficient that, for each eigenvalue e, q(x) be divisible by (x-e)^(m_e) where m_e is the size of the largest e-block in M.
If M has only one block for each eigenvalue, then m_e is the multiplicity of x-e in M's characteristic polynomial, so this condition becomes "M's characteristic polynomial must divide q". That proves one direction.
Conversely, if for some eigenvalue f, M has more than one block, then m_f is smaller than the power of x-f in M's characteristic polynomial, so q(x) = product of (x-e)^(m_e) is a nonzero polynomial with q(M) = 0 and of degree less than that of the characteristic polynomial.
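The converse direction can be illustrated on a matrix with two blocks for one eigenvalue. This is a sketch; the helpers jordan and eval_product and the choice of blocks are assumptions for illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def jordan(blocks):
    """Matrix in JCF built from a list of (eigenvalue, block size) pairs."""
    n = sum(size for _, size in blocks)
    J = [[0] * n for _ in range(n)]
    pos = 0
    for e, size in blocks:
        for i in range(size):
            J[pos + i][pos + i] = e
            if i + 1 < size:
                J[pos + i][pos + i + 1] = 1
        pos += size
    return J

def eval_product(M, exponents):
    """Evaluate the product of (M - e)^k over the (e, k) in exponents."""
    n = len(M)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for e, k in exponents.items():
        shifted = [[M[i][j] - (e if i == j else 0) for j in range(n)]
                   for i in range(n)]
        for _ in range(k):
            result = matmul(result, shifted)
    return result

def is_zero(A):
    return all(x == 0 for row in A for x in row)

# 4x4 matrix: eigenvalue 2 has blocks of size 2 and 1, eigenvalue 5 has one.
M = jordan([(2, 2), (2, 1), (5, 1)])

# Largest 2-block has size 2, so q(x) = (x-2)^2 (x-5) has degree 3 < 4.
q_of_M = eval_product(M, {2: 2, 5: 1})

# Dropping the power of (x-2) below the largest block size fails.
too_small = eval_product(M, {2: 1, 5: 1})
```

So q_of_M is the zero matrix even though q has degree 3, while too_small is not zero.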
6. Let M be a matrix as in #5. Show that T is a polynomial in M if, and only if, T commutes with M. (In fact the iff stuff goes deeper. If this statement is true about some M, then M is as in #5.)
A. If T is a polynomial in M, then obviously T commutes with M.
A dull, now familiar, calculation shows that the matrices that commute with M are the block diagonal matrices such that each block is upper triangular and banded (constant along each band parallel to the diagonal).
We give two proofs of the reverse. The first is hands-on but long, the second is short but unsatisfying.
Proof 1: First we show that if A,B are upper triangular and banded, and the entries just above A's diagonal are nonzero, then B is a polynomial in A. Let A' be A - a_{11}*(the identity), a polynomial in A, and now with zero diagonal. Now reduce B to zero in stages: if the diagonal isn't zero, then subtract off the right multiple of the identity. If the band just above the diagonal isn't zero, subtract off the right multiple of A'. If the band above that isn't zero, subtract off the right multiple of A'^2. Then A'^3, etc. (The important thing is that A'^k has k zero bands, and then a nonzero band.) When we're done, we find that B was a sum of constants times powers of A', and is therefore a polynomial in A.
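The staged reduction in this paragraph can be sketched as a short routine. The function name poly_in_shifted and the sample matrices A and B are assumptions for illustration; exact fractions are used so the divisions introduce no rounding.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_in_shifted(A, B):
    """Find c_0, c_1, ... with B = sum of c_k * A'^k, A' = A - a_11 * identity.

    Assumes A and B are upper triangular and banded, and that A is
    nonzero just above the diagonal. Returns (coefficients, remainder).
    """
    n = len(A)
    shift = [[Fraction(A[i][j]) - (A[0][0] if i == j else 0)
              for j in range(n)] for i in range(n)]
    remainder = [[Fraction(x) for x in row] for row in B]
    power = [[Fraction(1 if i == j else 0) for j in range(n)]
             for i in range(n)]              # A'^0
    coeffs = []
    for k in range(n):
        # A'^k is zero on the first k bands and nonzero on band k,
        # so the right multiple clears band k of the remainder.
        c = remainder[0][k] / power[0][k]
        coeffs.append(c)
        remainder = [[remainder[i][j] - c * power[i][j] for j in range(n)]
                     for i in range(n)]
        power = matmul(power, shift)
    return coeffs, remainder

A = [[2, 1, 3], [0, 2, 1], [0, 0, 2]]
B = [[5, 4, 7], [0, 5, 4], [0, 0, 5]]
coeffs, remainder = poly_in_shifted(A, B)
```

For these matrices the stages produce B = 5*(the identity) + 4 A' - 5 A'^2, and the remainder ends up zero.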
If e is an eigenvalue of M, let p_e be the characteristic polynomial of M divided by all of its (x-e) factors. Let N be a Jordan block of M with f's on the diagonal. Then p_e(N)=0 if and only if f is not e. If f=e, then p_e(N) is banded with nonzero diagonal.
Annoyingly, we have to split into two cases. If p_e(N), for N the e block, is zero just above the diagonal, let q_e(x) = x p_e(x). Otherwise let q_e(x) = p_e(x). Either way, we find that q_e(M) is block diagonal, with all zero blocks except for the e block, which is nonzero just above the diagonal.
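We're now ready to show that if T commutes with M, then T is a polynomial in M. Let M_e be the Jordan block of M with e's on the diagonal, and T_e be the corresponding block of T. The block p_e(M_e) is invertible (its diagonal is nonzero), and its inverse is a polynomial in it, so p_e(M_e)^{-1} T_e is again upper triangular and banded. Then by the "First we show" paragraph above, we can find a polynomial f_e(x) with f_e(q_e(M_e)) = p_e(M_e)^{-1} T_e.
Therefore if we let F(x) = sum_e p_e(x) f_e(q_e(x)), we find F(M) = T. On the e block, the e term gives p_e(M_e) p_e(M_e)^{-1} T_e = T_e, and every other term vanishes there because of its p_f factor. (Without that factor, f_f(q_f(M)) would be f_f(0) times the identity on the other blocks, rather than zero.)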
All right, I admit now that this is the hardest homework question we've had so far. But you know, it's just matrix multiplication, when you get down to it...
Proof 2: we've already computed which T's commute with M, and the dimension of that space is dim V. Now think about the powers 1, M, M^2, ... M^(dim V - 1). Each of these commutes with M, and if this set is linearly dependent, then M satisfies a nonzero polynomial of degree < dim V. But we know it doesn't. So the polynomials in M form a (dim V)-dimensional subspace of the (dim V)-dimensional space of matrices that commute with M, and the two spaces are equal. Ta-da!
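The dimension count can be checked numerically by flattening the powers of M into vectors and computing their rank. This is a sketch with made-up example matrices and helper names; the rank is computed by Gaussian elimination over exact fractions.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0),
                     None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

def rank_of_powers(M):
    """Dimension of the span of 1, M, ..., M^(n-1), where M is n by n."""
    n = len(M)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    flattened = []
    for _ in range(n):
        flattened.append([x for row in power for x in row])
        power = matmul(power, M)
    return rank(flattened)

# One block per eigenvalue, as in #5: the powers are independent.
one_block_each = rank_of_powers([[2, 1, 0], [0, 2, 0], [0, 0, 5]])

# Two blocks for eigenvalue 2: M satisfies (x-2)(x-5), so the rank drops.
two_blocks = rank_of_powers([[2, 0, 0], [0, 2, 0], [0, 0, 5]])
```

The first rank equals dim V = 3, matching the argument; the second is only 2.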