%pwd
/home/bosco_d/fac/berlek/public_html/CLASS.110
%ls -F
mbox
student.eadr
HOMEWORK/
L.U
lectures
%cd HOMEWORK
%ls -F
Sep2
Sep9
%cat Sep9
Assignments due Sep 9:
I have written up a description of the factorization of a
permuted arbitrary square matrix into a product of triangular
matrices. It appears in the CLASS.110 directory.
READ IT.
=========================================================
On Sep 2, the class constructed the following binary matrix:
011100
011101
001001 = A
101110
111100
011010
and the following 6-dimensional binary vector:
1
1
0 = b
1
0
1
Recall that the BINARY field has only two elements,
0 and 1, and that 1+1 = 0
/* assignment continues on next page */
Problem 1: For the 6x6 binary matrix, A, find a permutation
matrix P, a lower-diagonal matrix, L, and an upper diagonal
matrix U, such that PA = UL.
Problem 2: Find all 6-dimensional binary vectors which solve
the equation
A x = b
(Hint: Write PAx = LUx = Pb. First find all vectors y
such that Ly = PB, and then solve Ux = y.)
Problem 3: If possible, find another 6-dimensional binary vector,
c, for which the equation
Ax = c has a different number of solutions than Ax = b.
Problem 4: How many different 6-dimensional binary vectors are
there in the column space of A? In the null space of A?
=============================================================
Read Sec 2.1 of the text.
Do the following exercises on pages 69-70:
2.1.5, 2.1.6, 2.1.7, 2.1.8, and 2.1.9.
Read Sections 2.2, 2.3, and 2.4
#############################################################
%cat L.u
FACTORIZATION OF PERMUTED SQUARE MATRIX INTO LU
As in Chapter 1 of the text,
we are studying the factorization of an
arbitrary matrix into a permutation matrix times a
a lower triangular matrix times an upper triangular matrix.
MOTIVATION:
Given an arbitrary system of n linear equations in n unknowns,
Mx = a,
then if PM = LU, we have LUx = P^-1 a, which can be quickly
solved by first finding w such that
Lw = P^-1 a
and then x such that
Ux = w
The solution of both of these triangular systems (for w and then
for x) entails no more arithmetic effort that multiplying a single
n dimenstional vector, a, by a single nxn matrix, such as M^-1.
So the factorization into LU obviates the need for an inverse.
It also often has other advantages in speed, accuracy, and
generality, because a permutation matrix P and the triangular
matrices L and U can be found even when M is not invertible.
=============================================================
OVERVIEW OF THE MAIN LOOP:
At a typical intermediate stage of such a factorization of
a 6x6 matrix, we have a product such as
1 0 0 0 0 0 1 0 0 0 0 0 * * * * * *
* 1 0 0 0 0 0 1 0 0 0 0 0 * * * * *
* * 1 0 0 0 0 0 1 0 0 0 0 0 * * * * (#1)
* * 0 1 0 0 0 0 0 1 0 0 0 0 * * * *
* * 0 0 1 0 0 0 0 0 1 0 0 0 * * * *
* * 0 0 0 1 0 0 0 0 0 1 0 0 * * * *
where the asterisks denote arbitrary scalars. (The identity
matrix which we have placed in the middle here is, of course,
not essential, but we insert it to provide a migration path
for manipulations between the two matrices.)
At this stage, we may see the beginnings of
a lower triangular matrix in the first two columns of the
leftmost factor, and the beginnings of an upper triangular
matrix in the first two rows of the rightmost factor.
For specificity, we now introduce letters to denote certain
specific entries:
1 0 0 0 0 0 1 0 0 0 0 0 * * * * * *
* 1 0 0 0 0 0 1 0 0 0 0 0 * * * * *
a b 1 0 0 0 0 0 1 0 0 0 0 0 e f g h (#1')
* * 0 1 0 0 0 0 0 1 0 0 0 0 * * * *
* * 0 0 1 0 0 0 0 0 1 0 0 0 * * * *
c d 0 0 0 1 0 0 0 0 0 1 0 0 j k l m
The 3rd diagonal entry of the rightmost matrix is a very
important position, called the "pivot". Subjectively, our
next goal is to "move" the entries below this pivot position
from the rightmost matrix to the leftmost matrix. This will
require dividing these entries by the pivot. To facilitate
this, we may find it necessary or desirable to swap a pair
of rows in the rightmost matrix. We shall show that this
can be accomplished by premultiplying the product of all
three matrices in expression (#1) by an appropriate permutation
matrix, P, which swaps the 3rd row with some subsequent row.
For example, if we wish to swap the 3rd row with the 6th row,
then the matrix which accomplishes this is the transposition
matrix,
1 0 0 0 0 0
0 1 0 0 0 0
0 0 0 0 0 1 = P (#2)
0 0 0 1 0 0
0 0 0 0 1 0
0 0 1 0 0 0
We shall later verify that P times (#1') can be expressed in this
form:
1 0 0 0 0 0 1 0 0 0 0 0 " " " " " "
" 1 0 0 0 0 0 1 0 0 0 0 0 " " " " "
c d 1 0 0 0 0 0 1 0 0 0 0 0 j k l m (#3')
" " 0 1 0 0 0 0 0 1 0 0 0 0 " " " "
" " 0 0 1 0 0 0 0 0 1 0 0 0 " " " "
a b 0 0 0 1 0 0 0 0 0 1 0 0 e f g h
where each quotation mark represents an element that is
unchanged from (#3').
The purpose of this row swap is to ensure that j divides each
element beneath it, so that (#3') can be rewritten as this:
1 0 0 0 0 0 1 0 0 0 0 0 * * * * * *
* 1 0 0 0 0 0 1 0 0 0 0 0 * * * * *
c d 1 0 0 0 0 0 1 0 0 0 0 0 j * * * (#3)
* * 0 1 0 0 0 0 0 1 0 0 0 0 pj * * *
* * 0 0 1 0 0 0 0 0 1 0 0 0 qj * * *
a b 0 0 0 1 0 0 0 0 0 1 0 0 rj * * *
We will also show that we can factor the rightmost matrix
in this expression into
1 0 0 0 0 0 * * * * * *
0 1 0 0 0 0 0 * * * * *
0 1 0 0 0 0 0 0 j * * * (#4)
0 0 p 1 0 0 0 0 0 * * *
0 0 q 0 1 0 0 0 0 * * *
0 0 r 0 0 1 0 0 0 * * *
We may then multiply out the product of the leftmost matrix
times the middle matrix,
1 0 0 0 0 0 1 0 0 0 0 0
* 1 0 0 0 0 0 1 0 0 0 0
* * 1 0 0 0 0 0 1 0 0 0 (#5)
* * 0 1 0 0 0 0 p 1 0 0
* * 0 0 1 0 0 0 q 0 1 0
* * 0 0 0 1 0 0 r 0 0 1
which equals
1 0 0 0 0 0 1 0 0 0 0 0
* 1 0 0 0 0 0 1 0 0 0 0
* * 1 0 0 0 0 0 1 0 0 0 (#6)
* * p 1 0 0 0 0 0 1 0 0
* * q 0 1 0 0 0 0 0 1 0
* * r 0 0 1 0 0 0 0 0 1
In other words, after premultiplying expression (#1) by an
appropriate permutation matrix such as (#2), we are able to
rewrite the product in the following form:
1 0 0 0 0 0 1 0 0 0 0 0 " " " " " "
" 1 0 0 0 0 0 1 0 0 0 0 0 " " " " "
c d 1 0 0 0 0 0 1 0 0 0 0 0 * * * * (#7)
" " * 1 0 0 0 0 0 1 0 0 0 0 0 * * *
" " * 0 1 0 0 0 0 0 1 0 0 0 0 * * *
a b * 0 0 1 0 0 0 0 0 1 0 0 0 * * *
Here each quotation mark represents an entry that is unchanged
from expression (#1).
OVERVIEW OF THE WHOLE PROGRAM
Initially, we have an arbitrary matrix, M, which we preceed
with two identity matrices. We then apply the above methodology
to the first column, then to the second column, then to the
third column (as detailed above), ..., until all columns
have bee processed and the final expression is the product
of a lower triangular matrix L, and identity I, and an upper
triangular matrix, U. The matrix L has ones on the diagonal.
As each column is processed, it may entail the introduction of
another permutation matrix on the left. Each such permutation
matrix is either an identity or a simple "swap" of the row
containing the current pivot position with a lower row.
However, the product of all these permutation matrices can
yield an arbitrary permutation. Notice that this permutation
matrix cannot be predicted in advance; it is constructed along
with L and U in the course of the computations done by the
algorithm.
VERIFICATION OF DETAILS.
We now verify assertions stated above.
To check the validity of #4, multiply it out to obtain
the rightmost matrix in expression #3.
To check the validity of expression #6, multiply out expression #5.
To check the validity of expressions #3' and #4, we offer more details,
and call them Lemmas:
===================================================================
Lemma 3':
(#2) (#1') = (#3')
We partition the permutation matrix as
I 0
0 P
where the upper I is 2x2, the P is 4x4, and the zeroes represent
all-zero matrices of sizes 2x4 and 4x2 respectively. We partition
each of the matrices in expression (#1') similarly, to obtain
the following expression for the combined product:
I O L O I O U V
O P A I 0 I 0 W =
I O L O I O U V
O I PA P 0 I 0 W =
I O L O I O U V
O I PA I 0 P 0 W =
I O L O I O U V
O I PA I 0 I 0 PW
===================================================================
Lemma 4:
The permutation P (exemplified by #2) can always be chosen to ensure
that every element below the pivot is divisible by the pivot.
Formal proof:
Since we are working in a field, every nonzero element is a divisor
of everything, so the problem can be formally resolved simply by
permutating a nonzero element into the pivot position, if any such
candidate exists. If not, then all elements on or below the
diagonal in the present column are zero, and the problem is solved
by the identity permutation and the fact that zero is a factor
of zero.
Numerical Issues:
The formal proof given above is sufficient for some fields,
including Z2. It may also be viewed as sufficient for the rationals,
Q, assuming that each rational is represented as a quotient of
integers with arbitrary precision. Although there are a few
computer programs which support such arithmetic, they are rare.
Most computers treat rationals
as real numbers with limited precision. (In many applications,
the raw data has accuracy far less than the precision of the
computer's arithmetic. But a careless program can still
get meaningless or even misleading answers by assuming infinite
precision in situations where it really isn't there.)
When the field includes the real numbers, numerical accuracy can
become an important concern.
If a and b are positive numbers, each with the same (small)
percentage error, then the percentage error of the sum is as
good as the percentage error of the summands. The product, ab,
and the quotient, b/a, each has percentage error which is perhaps
twice the percentage error of a and b. However, the difference,
a-b, can have an arbitrarily large percentage error, especially when
a and b are large but their difference is small.
Sums, differences, products, and quotients all occur in the present
context. Every element in the lower right portion of the rightmost
matrix being processed is affected by some multiple of the reciprocal
of the pivot. The conventional rule for choosing the pivot is to
select the candidate of largest magnitude. This offers two advantages:
1) If, at some earlier stage of the calculation, all elements were
of roughly the same order of magnitude and had the same accuracy,
then an element which now has small magnitude is much more likely
to be the result of a difference between big numbers
then a maximal element. Hence, the maximal element is likely to
have much better relative accuracy then a competitor of small
magnitude.
2). As the pivot element appears in many denominators, giving it
large magnitude creates a tendency for the terms in which
it is a denominator to have smaller magnitude than the terms
to which they are destined to be added. This tendency reduces
the risk of getting small elements of small magnitudes as
differences (or signed sums) of terms with large but very
close magnitudes.
However, the details can be quite complicated, and well beyond the
scope of this course. Profs. Demmel and Kahan are renowned experts
on such matters; Demmel is currently offering a graduate course on
this topic.
COLUMN PERMUTATIONS
Another variation of the factorization method discussed here allows
permutations of columns as well as rows. Such column permutations
are accomplished by postmultiplying expression (#1) by permutation
matrices, Q. Like P in (#2), Q is also chosen to have an identity
submatrix in its top left corner, so that columns preceding the
pivot location are unaffected. Like P, Q is also either an identity
or a swap between the pivot column and some subsequent column.
By appropriate choice of P and Q, one can easily swap any entry
in the lower right submatrix into the pivot position. When the
field includes the real numbers, the conventional choice is again
a candidate of largest magnitude. In other fields, such as Z2,
the choice is a nonzero element, if any such candidate exists.
This ensures that the pivot element now divides not only all
subsequent elements in its column, but also all subsequent elements
in its row. This enables one to "factor out" the pivot elements
into the middle of the three matrices, so that expression #1 becomes
1 0 0 0 0 0 * 0 0 0 0 0 1 s s s s s
s 1 0 0 0 0 0 * 0 0 0 0 0 1 s s s s
s s 1 0 0 0 0 0 1 0 0 0 0 0 * * * *
s s 0 1 0 0 0 0 0 1 0 0 0 0 * * * *
s s 0 0 1 0 0 0 0 0 1 0 0 0 * * * *
s s 0 0 0 1 0 0 0 0 0 1 0 0 * * * *
In the case of fields including the real numbers, each s is
a "small" element, of magnitude at most 1.
Using this version of the algorithm, one obtains the theorem
that an arbitrary matrix, M, after permutations by appropriate
matrices P and Q, can be factored as
PMQ = LDU
where L is lower triangular, D is diagonal, and U is upper triangular,
both L and U have all ones on their diagonals, and (in the real
or complex field) only elements of magnitudes <= 1 elsewhere.
In the other version of the algorithm, one restricts Q = D = I,
but then the diagonal of U becomes arbitrary and there is no
constraint on the magnitude of U's above-diagonal elements.
===================================================================
MEMORY REQUIREMENTS
Excluding the permutation matrices P and Q, observe that at each
stage of the algorithm, the total number of "arbitrary" (i.e.,
not constrained to be zero or one) elements is always precisely
n^2. For this reason, it is quite possible to do the entire
calculation "in place", with the off-diagonal elements of L and
U cohabiting the same nxn array. This memory assignment is quite
consistent with some operations, such as the "row swap" instigated
by premultiplication by the permutation matrix, P.
Although this may be computationally convenient, it is conceptually
much better to view L and U or DU as separate entities.
===================================================================
ARITHMETIC COMPUTATIONAL REQUIREMENTS
These are discussed in out textbook in considerable detail.
Be sure to read carefully! Most of the textbook's discussion
focusses on the transition from expression #3 to expression #4,
because that is indeed where most of the arithmetic occurs.
However, on many types of computers, the ratio of the time
for a typical addition and multiplication to the time of a
typical memory fetch has been decreasing in recent years.
So to obtain more accurate running-time estimates, one would
need to put more emphasis on the permutations then our texbook does.
/* Also see the posted file entitled "L.U" */