Sep 7:

IMPORTANT DEFINITIONS:

A set of vectors, v[i], is (linearly) DEPENDENT if there exist scalars c[i], not all zero, such that

    SUM c[i]v[i] = 0

A set of vectors which is not (linearly) dependent is said to be (linearly) INDEPENDENT.

COMMENTS:

Nonlinear dependencies are so obscure and little-used that in Strang's textbook, he simply uses the terms "dependent" and "independent", with the adjective "linearly" understood.

ERASURE-CORRECTING BINARY CODE(S):

This can provide an illustration of these definitions, and of the LU factorization methods discussed earlier. It also provides a simple example of how BINARY matrices can be useful in digital communications and computer memory systems.

Consider this 5 x 16 binary matrix:

        1111010110010000
        0111101011001000
    H = 0011110101100100
        1110101100100010
        1010011011100001

Given any values of the 11 bits x1, x2, ..., x11, we can find values of the last five bits x12, x13, ..., x16 such that

    H x = 0

The first 11 bits are called "message bits"; the last five are called "check bits". The set of all 16-bit binary vectors, x, which satisfy these constraints is called the CODE. The vectors in this code are called CODEWORDS. This code is the null space of the matrix H. It contains 2^11 = 2048 codewords.

Given an 11-bit message, we may first "encode" it by appending to those 11 bits an additional five "check" bits to obtain a 16-bit codeword, which is then stored or transmitted in a way which is less than completely reliable. Suppose that a small number of the 16 bit values are corrupted in such a way that they are "erased" (i.e., replaced with question marks).

Question: In general, under what circumstances can the recipient of the corrupted block be sure that he has recovered the original message correctly?

Answer: If the corresponding columns of the H matrix are (linearly) independent.
This is because the recipient can construct a set of linear binary equations in the erased bits, and the coefficient matrix of this system of linear equations consists of the columns of the H matrix corresponding to the locations of the erasures.

/* The class picked an 11-bit message vector which I encoded. We then also picked a set of five positions to "erase". I wrote down the relevant set of 5 binary equations in the 5 unknown bits, factored the relevant 5x5 binary matrix into an LU form, and tried to solve. The example turned out to have two solutions, which agreed in 1 of the five erased bits but differed in the other four. This example showed that this coding technique is limited; if there are too many erasures, then it may be impossible to recover the initial codeword uniquely; this particular case of 5 erasures was too many. */

Henceforth, assume the particular H matrix specified above.

Question: Is there any single erasure which cannot be uniquely corrected?

Answer: No. A single (column) vector is (linearly) "dependent" only if it is zero, and this H matrix has no all-zero column.

Question: Is there any pair of erasures which cannot be uniquely corrected?

Answer: No. A pair of nonzero vectors, x and y, is (linearly) dependent only if there are scalars a and b, not both zero, such that

    ax + by = 0

If either a or b (but not both) were zero, then we could deduce that y or x = 0, contradicting our prior observation that all columns of H are nonzero. If both a and b are nonzero then, in the binary field, they must both be 1, whence we have x = y. But we observe that all columns of H are distinct.

Question: Is there any pattern of 3 erasures which cannot be uniquely corrected?

Answer: No. The sum of all of the rows of the H matrix is

    1111111111111111

So this is also a linear constraint on the code space. With x[i] denoting the i-th column of H, it says that if SUM c[i]x[i] = 0, then SUM c[i] = 0, which means there are an EVEN number of nonzero c's. A dependency among 3 columns would therefore involve exactly 2 nonzero c's, i.e., a dependent pair of columns, which has just been ruled out.
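The recovery procedure described above (set up linear binary equations in the erased bits, then solve them) can be sketched in Python. This is only an illustration under stated assumptions: the function names `encode`, `recover`, and `erase`, the sample message, and the use of plain Gauss-Jordan elimination in place of an LU factorization are choices made here, not part of the notes. The final check tries every pattern of 1, 2, or 3 erasures by brute force.

```python
from itertools import combinations

# H as given in the notes; everything else below is an illustrative assumption.
H = [[int(ch) for ch in row] for row in (
    "1111010110010000",
    "0111101011001000",
    "0011110101100100",
    "1110101100100010",
    "1010011011100001",
)]


def encode(message):
    """Append five check bits to an 11-bit message so that H x = 0 (mod 2)."""
    # Columns 12..16 of H form an identity matrix, so check bit i is the
    # mod-2 dot product of row i with the message bits (zip stops at 11).
    checks = [sum(h * m for h, m in zip(row, message)) % 2 for row in H]
    return list(message) + checks


def erase(word, positions):
    """Replace the bits at the given 1-based positions with None."""
    return [None if j + 1 in positions else bit for j, bit in enumerate(word)]


def recover(received):
    """Fill in erased bits (marked None); return None if not uniquely solvable."""
    erased = [j for j, bit in enumerate(received) if bit is None]
    # One equation per row of H: (erased columns) . unknowns = known part.
    rows = []
    for h in H:
        rhs = sum(h[j] * bit for j, bit in enumerate(received) if bit is not None) % 2
        rows.append([h[j] for j in erased] + [rhs])
    # Gauss-Jordan elimination over the binary field (addition is XOR).
    for c in range(len(erased)):
        pivot = next((i for i in range(c, len(rows)) if rows[i][c]), None)
        if pivot is None:
            return None  # erased columns are dependent: no unique solution
        rows[c], rows[pivot] = rows[pivot], rows[c]
        for i in range(len(rows)):
            if i != c and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[c])]
    result = list(received)
    for k, j in enumerate(erased):
        result[j] = rows[k][-1]  # row k now reads: unknown k = rhs
    return result


message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]  # arbitrary sample message
codeword = encode(message)

# Every pattern of 1, 2, or 3 erasures should be recovered exactly.
ok = all(
    recover(erase(codeword, set(p))) == codeword
    for k in (1, 2, 3)
    for p in combinations(range(1, 17), k)
)
print(ok)
```

With this H, `ok` comes out `True`, in line with the three answers above. The sketch assumes the unerased bits are received without error, which is the erasure model used in these notes.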
Question: Is there any pattern of 4 erasures which cannot be uniquely corrected?

Answer: YES. For example, 1st, 12th, 15th, and 16th. (Column 1 of H is the sum of columns 12, 15, and 16.)

So the properties of H which ensure that any pattern of 1, 2, or 3 erasures can be uniquely corrected are these:

    1. All columns are distinct
    2. The all-ones vector is in the row space
    3. All columns are nonzero

(Note that property 2 immediately implies property 3.)

So instead of the H shown above, one might begin by writing down a 4x16 matrix whose columns are all 16 4-bit binary column vectors, in any order, and then appending a fifth row which is the all-ones vector. This would be the same as the H matrix shown above, except for premultiplication by some invertible 5x5 matrix (which can't change the null space) and some permutation of the columns.

The H matrix listed above has its columns permuted in such a way that a 5x5 identity matrix appears on its right. This makes it easier to explain that the 16-dimensional vectors in the null space can conveniently be regarded as consisting of an initial eleven "message" bits, followed by five "check" bits.

The H matrix listed above also has the additional property that its row space is the same as the row space of the following matrix, whose second, third, and fourth rows are all shifts of its first row (a property which can be helpful in simplifying the circuitry of certain kinds of implementations):

         1111010110010000
         0111101011001000
    H' = 0011110101100100
         0001111010110010
         1111111111111111

ANOTHER QUESTION: If 4 erasures can be uncorrectable, how is it that sometimes 5 erasures are correctable?

ANSWER: Whether an erasure pattern can be corrected depends BOTH on how many bits are erased and on where the particular erasures happen to occur.
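The dependence on both the number and the positions of the erasures can be checked by brute force: enumerate every set of s columns of H and test it for independence over the binary field. The sketch below is an illustration only; the helper names `independent` and `fraction_correctable` are made up here, and packing each column into an integer is just a convenience, not something the notes prescribe.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Rows of H as given in the notes.
H_ROWS = [
    "1111010110010000",
    "0111101011001000",
    "0011110101100100",
    "1110101100100010",
    "1010011011100001",
]

# Pack column j of H into a 5-bit integer (row 1 becomes the high bit).
COLS = [int("".join(row[j] for row in H_ROWS), 2) for j in range(16)]


def independent(cols):
    """True if the given columns are linearly independent over the binary field."""
    basis = []  # reduced vectors, each with a distinct leading bit
    for v in cols:
        for b in basis:
            v = min(v, v ^ b)  # XOR with b only when v has b's leading bit set
        if v == 0:
            return False  # v is a sum of some of the earlier columns
        basis.append(v)
    return True


def fraction_correctable(s):
    """Fraction of the C(16, s) erasure patterns whose columns are independent."""
    good = sum(independent(c) for c in combinations(COLS, s))
    return Fraction(good, comb(16, s))


for s in range(1, 7):
    print(s, fraction_correctable(s))

# The 4-erasure example from the text: positions 1, 12, 15, 16 (1-based).
print(independent([COLS[p - 1] for p in (1, 12, 15, 16)]))
```

Run as written, the last line prints `False`. For s = 4 and s = 5 the enumeration finds 1680 of 1820 and 2688 of 4368 patterns correctable, respectively, so some 5-erasure patterns succeed even though some 4-erasure patterns fail.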
With the particular H matrix shown above, if there are s erasures, and if they are distributed randomly among the 16 bits, then here are the probabilities:

      s    Prob uniquely correctable    Prob not uniquely correctable
      1               1                              0
      2               1                              0
      3               1                              0
      4             12/13                           1/13
      5              8/13                           5/13
    >=6               0                              1

###################################################################