FACTORIZATION OF PERMUTED SQUARE MATRIX INTO LU As in Chapter 1 of the text, we are studying the factorization of an arbitrary matrix into a permutation matrix times a a lower triangular matrix times an upper triangular matrix. MOTIVATION: Given an arbitrary system of n linear equations in n unknowns, Mx = a, then if PM = LU, we have LUx = Pa, which can be quickly solved by first finding w such that Lw = Pa and then x such that Ux = w The solution of both of these triangular systems (for w and then for x) entails no more arithmetic effort that multiplying a single n dimenstional vector, a, by a single nxn matrix, such as M^-1. So the factorization into LU obviates the need for an inverse. It also often has other advantages in speed, accuracy, and generality, because a permutation matrix P and the triangular matrices L and U can be found even when M is not invertible. ============================================================= OVERVIEW OF THE MAIN LOOP: At a typical intermediate stage of such a factorization of a 6x6 matrix, we have a product such as 1 0 0 0 0 0 1 0 0 0 0 0 * * * * * * * 1 0 0 0 0 0 1 0 0 0 0 0 * * * * * * * 1 0 0 0 0 0 1 0 0 0 0 0 * * * * (#1) * * 0 1 0 0 0 0 0 1 0 0 0 0 * * * * * * 0 0 1 0 0 0 0 0 1 0 0 0 * * * * * * 0 0 0 1 0 0 0 0 0 1 0 0 * * * * where the asterisks denote arbitrary scalars. (The identity matrix which we have placed in the middle here is, of course, not essential, but we insert it to provide a migration path for manipulations between the two matrices.) At this stage, we may see the beginnings of a lower triangular matrix in the first two columns of the leftmost factor, and the beginnings of an upper triangular matrix in the first two rows of the rightmost factor. For specificity, we now introduce letters to denote certain specific entries: 1 0 0 0 0 0 1 0 0 0 0 0 * * * * * * * 1 0 0 0 0 0 1 0 0 0 0 0 * * * * * a b 1 0 0 0 0 0 1 0 0 0 0 0 e f g h (#1') * * 0 1 0 0 0 0 0 1 0 0 0 0 * * * * * * 0 0 1 0 0 0 0 0 1 0 0 0 * * * * c d 0 0 0 1 0 0 0 0 0 1 0 0 j k l m The 3rd diagonal entry of the rightmost matrix is a very important position, called the "pivot". Subjectively, our next goal is to "move" the entries below this pivot position from the rightmost matrix to the leftmost matrix. This will require dividing these entries by the pivot. To facilitate this, we may find it necessary or desirable to swap a pair of rows in the rightmost matrix. We shall show that this can be accomplished by premultiplying the product of all three matrices in expression (#1) by an appropriate permutation matrix, P, which swaps the 3rd row with some subsequent row. For example, if we wish to swap the 3rd row with the 6th row, then the matrix which accomplishes this is the transposition matrix, 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 = P (#2) 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 We shall later verify that P times (#1') can be expressed in this form: 1 0 0 0 0 0 1 0 0 0 0 0 " " " " " " " 1 0 0 0 0 0 1 0 0 0 0 0 " " " " " c d 1 0 0 0 0 0 1 0 0 0 0 0 j k l m (#3') " " 0 1 0 0 0 0 0 1 0 0 0 0 " " " " " " 0 0 1 0 0 0 0 0 1 0 0 0 " " " " a b 0 0 0 1 0 0 0 0 0 1 0 0 e f g h where each quotation mark represents an element that is unchanged from (#3'). The purpose of this row swap is to ensure that j divides each element beneath it, so that (#3') can be rewritten as this: 1 0 0 0 0 0 1 0 0 0 0 0 * * * * * * * 1 0 0 0 0 0 1 0 0 0 0 0 * * * * * c d 1 0 0 0 0 0 1 0 0 0 0 0 j * * * (#3) * * 0 1 0 0 0 0 0 1 0 0 0 0 pj * * * * * 0 0 1 0 0 0 0 0 1 0 0 0 qj * * * a b 0 0 0 1 0 0 0 0 0 1 0 0 rj * * * We will also show that we can factor the rightmost matrix in this expression into 1 0 0 0 0 0 * * * * * * 0 1 0 0 0 0 0 * * * * * 0 0 1 0 0 0 0 0 j * * * (#4) 0 0 p 1 0 0 0 0 0 * * * 0 0 q 0 1 0 0 0 0 * * * 0 0 r 0 0 1 0 0 0 * * * We may then multiply out the product of the leftmost matrix times the middle matrix, 1 0 0 0 0 0 1 0 0 0 0 0 * 1 0 0 0 0 0 1 0 0 0 0 * * 1 0 0 0 0 0 1 0 0 0 (#5) * * 0 1 0 0 0 0 p 1 0 0 * * 0 0 1 0 0 0 q 0 1 0 * * 0 0 0 1 0 0 r 0 0 1 which equals 1 0 0 0 0 0 1 0 0 0 0 0 * 1 0 0 0 0 0 1 0 0 0 0 * * 1 0 0 0 0 0 1 0 0 0 (#6) * * p 1 0 0 0 0 0 1 0 0 * * q 0 1 0 0 0 0 0 1 0 * * r 0 0 1 0 0 0 0 0 1 In other words, after premultiplying expression (#1) by an appropriate permutation matrix such as (#2), we are able to rewrite the product in the following form: 1 0 0 0 0 0 1 0 0 0 0 0 " " " " " " " 1 0 0 0 0 0 1 0 0 0 0 0 " " " " " c d 1 0 0 0 0 0 1 0 0 0 0 0 * * * * (#7) " " * 1 0 0 0 0 0 1 0 0 0 0 0 * * * " " * 0 1 0 0 0 0 0 1 0 0 0 0 * * * a b * 0 0 1 0 0 0 0 0 1 0 0 0 * * * Here each quotation mark represents an entry that is unchanged from expression (#1). OVERVIEW OF THE WHOLE PROGRAM Initially, we have an arbitrary matrix, M, which we preceed with two identity matrices. We then apply the above methodology to the first column, then to the second column, then to the third column (as detailed above), ..., until all columns have bee processed and the final expression is the product of a lower triangular matrix L, and identity I, and an upper triangular matrix, U. The matrix L has ones on the diagonal. As each column is processed, it may entail the introduction of another permutation matrix on the left. Each such permutation matrix is either an identity or a simple "swap" of the row containing the current pivot position with a lower row. However, the product of all these permutation matrices can yield an arbitrary permutation. Notice that this permutation matrix cannot be predicted in advance; it is constructed along with L and U in the course of the computations done by the algorithm. VERIFICATION OF DETAILS. We now verify assertions stated above. To check the validity of #4, multiply it out to obtain the rightmost matrix in expression #3. To check the validity of expression #6, multiply out expression #5. To check the validity of expressions #3' and #4, we offer more details, and call them Lemmas: =================================================================== Lemma 3': (#2) (#1') = (#3') We partition the permutation matrix as I 0 0 P where the upper I is 2x2, the P is 4x4, and the zeroes represent all-zero matrices of sizes 2x4 and 4x2 respectively. We partition each of the matrices in expression (#1') similarly, to obtain the following expression for the combined product: I O L O I O U V O P A I 0 I 0 W = I O L O I O U V O I PA P 0 I 0 W = I O L O I O U V O I PA I 0 P 0 W = I O L O I O U V O I PA I 0 I 0 PW =================================================================== Lemma 4: The permutation P (exemplified by #2) can always be chosen to ensure that every element below the pivot is divisible by the pivot. Formal proof: Since we are working in a field, every nonzero element is a divisor of everything, so the problem can be formally resolved simply by permutating a nonzero element into the pivot position, if any such candidate exists. If not, then all elements on or below the diagonal in the present column are zero, and the problem is solved by the identity permutation and the fact that zero is a factor of zero. Numerical Issues: The formal proof given above is sufficient for some fields, including Z2. It may also be viewed as sufficient for the rationals, Q, assuming that each rational is represented as a quotient of integers with arbitrary precision. Although there are a few computer programs which support such arithmetic, they are rare. Most computers treat rationals as real numbers with limited precision. (In many applications, the raw data has accuracy far less than the precision of the computer's arithmetic. But a careless program can still get meaningless or even misleading answers by assuming infinite precision in situations where it really isn't there.) When the field includes the real numbers, numerical accuracy can become an important concern. If a and b are positive numbers, each with the same (small) percentage error, then the percentage error of the sum is as good as the percentage error of the summands. The product, ab, and the quotient, b/a, each has percentage error which is perhaps twice the percentage error of a and b. However, the difference, a-b, can have an arbitrarily large percentage error, especially when a and b are large but their difference is small. Sums, differences, products, and quotients all occur in the present context. Every element in the lower right portion of the rightmost matrix being processed is affected by some multiple of the reciprocal of the pivot. The conventional rule for choosing the pivot is to select the candidate of largest magnitude. This offers two advantages: 1) If, at some earlier stage of the calculation, all elements were of roughly the same order of magnitude and had the same accuracy, then an element which now has small magnitude is much more likely to be the result of a difference between big numbers then a maximal element. Hence, the maximal element is likely to have much better relative accuracy then a competitor of small magnitude. 2). As the pivot element appears in many denominators, giving it large magnitude creates a tendency for the terms in which it is a denominator to have smaller magnitude than the terms to which they are destined to be added. This tendency reduces the risk of getting small elements of small magnitudes as differences (or signed sums) of terms with large but very close magnitudes. However, the details can be quite complicated, and well beyond the scope of this course. Profs. Demmel and Kahan are renowned experts on such matters; Demmel is currently offering a graduate course on this topic. COLUMN PERMUTATIONS Another variation of the factorization method discussed here allows permutations of columns as well as rows. Such column permutations are accomplished by postmultiplying expression (#1) by permutation matrices, Q. Like P in (#2), Q is also chosen to have an identity submatrix in its top left corner, so that columns preceding the pivot location are unaffected. Like P, Q is also either an identity or a swap between the pivot column and some subsequent column. By appropriate choice of P and Q, one can easily swap any entry in the lower right submatrix into the pivot position. When the field includes the real numbers, the conventional choice is again a candidate of largest magnitude. In other fields, such as Z2, the choice is a nonzero element, if any such candidate exists. This ensures that the pivot element now divides not only all subsequent elements in its column, but also all subsequent elements in its row. This enables one to "factor out" the pivot elements into the middle of the three matrices, so that expression #1 becomes 1 0 0 0 0 0 * 0 0 0 0 0 1 s s s s s s 1 0 0 0 0 0 * 0 0 0 0 0 1 s s s s s s 1 0 0 0 0 0 1 0 0 0 0 0 * * * * s s 0 1 0 0 0 0 0 1 0 0 0 0 * * * * s s 0 0 1 0 0 0 0 0 1 0 0 0 * * * * s s 0 0 0 1 0 0 0 0 0 1 0 0 * * * * In the case of fields including the real numbers, each s is a "small" element, of magnitude at most 1. Using this version of the algorithm, one obtains the theorem that an arbitrary matrix, M, after permutations by appropriate matrices P and Q, can be factored as PMQ = LDU where L is lower triangular, D is diagonal, and U is upper triangular, both L and U have all ones on their diagonals, and (in the real or complex field) only elements of magnitudes <= 1 elsewhere. In the other version of the algorithm, one restricts Q = D = I, but then the diagonal of U becomes arbitrary and there is no constraint on the magnitude of U's above-diagonal elements. =================================================================== MEMORY REQUIREMENTS Excluding the permutation matrices P and Q, observe that at each stage of the algorithm, the total number of "arbitrary" (i.e., not constrained to be zero or one) elements is always precisely n^2. For this reason, it is quite possible to do the entire calculation "in place", with the off-diagonal elements of L and U cohabiting the same nxn array. This memory assignment is quite consistent with some operations, such as the "row swap" instigated by premultiplication by the permutation matrix, P. Although this may be computationally convenient, it is conceptually much better to view L and U or DU as separate entities. =================================================================== ARITHMETIC COMPUTATIONAL REQUIREMENTS These are discussed in out textbook in considerable detail. Be sure to read carefully! Most of the textbook's discussion focusses on the transition from expression #3 to expression #4, because that is indeed where most of the arithmetic occurs. However, on many types of computers, the ratio of the time for a typical addition and multiplication to the time of a typical memory fetch has been decreasing in recent years. So to obtain more accurate running-time estimates, one would need to put more emphasis on the permutations then our texbook does. /* Also see the file entitled "handout.Sep7" */