# Superfluous letters

Increpare (i.e., Stephen Lavelle), creator of Stephen’s Sausage Roll, has an interesting new short game, Superfluous letters, apparently the result of a serendipitous misunderstanding. It’s sort of a twist on Lights-Out: there are 26 letters drawn in 5x3 grids, and the goal is to find combinations of letters that xor together to any of the other letters. Every time you find such a combination, the found letter is disabled and you get a point. What’s the maximum score you can get?

In this post, I’ll describe a polynomial time algorithm to solve the game.

If you haven’t played with the game yet, though, please, go and futz with it for a while first! Here’s a link again: Superfluous letters.

## 1. Solving the game

Ok, now that you are back, you probably found some formulae, like $T+U+V=I$. If the game didn’t immediately disable the $I$, there’d also be the formula $I+T+U+V=\mathrm{\varnothing}$ (blank), which you can get by adding (xoring) $I$ to both sides. For a similar reason, we can get associated formulae $I+U+V=T$, $I+T+V=U$, and $I+T+U=V$ for all the other letters appearing in the formula. But, because letters are disabled, once you use the formula in any of its forms, it is worn out.

Now, what about $H+O+U=A$? If we use $I+T+V=U$, then we can’t use $H+O+U=A$ anymore to create $A$. At first glance, this seems like it might be the big issue for solving the game, that you’ll have to search for an order to execute formulae in. However, we can substitute the equation for $U$ into the one for $A$: $H+O+(I+T+V)=A$. So, the order we create letters actually doesn’t matter!

Now, let’s say we have a run-through of the game, and that at some
point we used the equation $I+T+V=U$. A surprising thing is that we
can instead use, say, $T+U+V=I$ and replace all of the $I$’s after
that point with $T+U+V$. If we also end up creating $T$ and $V$, we
might have to substitute other formulae for those letters. But, in
any case, we can transform the run-through into one where we create
$I$ instead of $U$ at that step. The takeaway here is that a
run-through can be described as a sequence of sets of letters that
sum to blank, so long as we aren’t picky about remembering exactly
which letters we create along the way. Importantly, the score you’d
get from the run-through doesn’t change.^{[1]}

Let $\mathcal{L}$ be the collection of subsets of letters that sum to blank, including the empty set. For instance, $\{I,T,U,V\}$ is an element of $\mathcal{L}$. A run-through gives some sequence ${A}_{1},\dots ,{A}_{k}\in \mathcal{L}$, and the max score question is about how big $k$ can be.

The idea is going to be that there is a meta Lights-Out game going on with the formulae themselves. I’m not sure how to talk about this without actually invoking some linear algebra, but here are a few facts about $\mathcal{L}$.

- We can use symmetric difference (xor for sets) as an addition-like operation on $\mathcal{L}$. As an example, we define $\{I,T,U,V\}+\{A,H,O,U\}=\{A,H,I,T,O,V\}$, the set of letters that are in exactly one of the two sets. This is supposed to correspond to the substitution operation we applied to a formula with a disabled letter.
- $\mathcal{L}$ is a finite set, so there is a minimal set ${A}_{1},\dots ,{A}_{k}$ that generates all of $\mathcal{L}$ (that is, every element of $\mathcal{L}$ is a sum of some of the ${A}_{1},\dots ,{A}_{k}$).
- No matter how you find a minimal generating set, $k$ will
always be the same number. This is known as the
*dimension*of $\mathcal{L}$. (This takes a little work to prove, but it is a standard result of linear algebra.) - None of the elements of a minimal generating set is a sum of any of the others. For instance, if ${A}_{5}={A}_{1}+{A}_{2}+{A}_{6}$, then we can remove ${A}_{5}$ from the generating set to get an even smaller generating set.

Consider a run-through ${A}_{1},\dots ,{A}_{k}$. The first observation is that this must be a generating set for $\mathcal{L}$, since if there is some ${A}^{\prime}$ in $\mathcal{L}$ that’s not generated by them, we can append ${A}^{\prime}$ to the run-through. Likely, we’ll have to eliminate a bunch of letters from ${A}^{\prime}$ that the run-through has disabled so far, but this is not a problem: since ${A}^{\prime}$ is not generated by ${A}_{1},\dots ,{A}_{k}$, the version of ${A}^{\prime}$ after elimination will not be empty, and since in addition no letter is completely blank, it will contain at least two letters.

The second observation is that in a maximal run-through,
${A}_{1},\dots ,{A}_{k}$ is a *minimal* generating set for
$\mathcal{L}$. If one of the sets were a sum of earlier sets, for
example if ${A}_{6}={A}_{2}+{A}_{3}+{A}_{5}$, then since ${A}_{2}$ creates a letter that
is not present for any of the future formulae, that letter would
have to be used in ${A}_{6}$, too, which is impossible. In general, if
any of the sets is a sum of the other sets, then by rearranging the
equation we get a set that’s a sum of earlier sets.

**Theorem.** *Every maximal run-through has the same
length, the dimension of $\mathcal{L}$.*

Anyway, this means that if we have any minimal generating set ${A}_{1},\dots ,{A}_{k}$, we can use elimination to get a sequence of moves from it, and with it we’ll achieve the maximum score.

## 2. The algorithm

Getting a generating set for $\mathcal{L}$ quickly needs some acquaintance with linear algebra. If you are happy with waiting for your computer, though, this will do:

- let $\mathcal{U}$ start as an empty subset of $\mathcal{L}$.
for every subset $A$ of letters:

- check if the letters in $A$ xor to blank. if they don’t, continue with the next $A$.
- check if $A$ is already a sum of sets from $\mathcal{U}$. if it is, continue with the next $A$.
- put $A$ into $\mathcal{U}$

- now $\mathcal{U}$ is a minimal generating set.

Let $\mathcal{P}$ be a $15\times 26$ matrix of $0$’s and $1$’s where each column corresponds to the pixels of a different letter. An element $A$ of $\mathcal{L}$ corresponds to a 26-element vector $v$ of $0$’s and $1$’s with ${v}_{i}=1$ if and only if the $i$th letter is in $A$, which means this vector satisfies $\mathcal{P}v=0$. That is, $\mathcal{L}$ corresponds to the nullspace of $\mathcal{P}$.

In Mathematica, $\text{NullSpace}[\mathcal{P},\text{Modulus}\to 2]$ gives a minimal generating set for the nullspace, given as a matrix with one generator per row. The algorithm is $O({n}^{3})$ with $n$ the maximum dimension of the matrix $\mathcal{P}$.

Then, we can convert this minimal generating set into a list of moves by (1) always choosing the first letter in a given set and (2) eliminating that letter from the remaining sets. This happens to be known as Gaussian elimination, or row reduction. Mathematica also happens to have this function built-in, so this is the complete algorithm for solving the game:

$$\text{RowReduce}[\text{NullSpace}[P,\text{Modulus}\to 2],\text{Modulus}\to 2]$$ with the understanding that we read off the rows of the result to get a run-through.It doesn’t give a speed run, but the program does run quickly. Here’s a solution it came up with:

$$\begin{array}{rl}\text{M}+\text{N}+\text{O}+\text{U}+\text{W}& =\text{A}\\ \text{J}+\text{K}+\text{U}+\text{W}+\text{X}+\text{Y}& =\text{B}\\ \text{K}+\text{N}+\text{O}+\text{T}+\text{U}+\text{W}+\text{X}+\text{Y}+\text{Z}& =\text{C}\\ \text{J}+\text{K}+\text{M}+\text{O}+\text{S}+\text{T}+\text{U}+\text{V}+\text{W}+\text{X}+\text{Z}& =\text{D}\\ \text{S}+\text{V}+\text{W}+\text{X}+\text{Y}& =\text{E}\\ \text{J}+\text{M}+\text{Q}+\text{R}+\text{S}+\text{T}+\text{U}+\text{V}+\text{W}+\text{Z}& =\text{F}\\ \text{J}+\text{M}+\text{O}+\text{R}+\text{S}+\text{T}+\text{V}+\text{X}+\text{Y}+\text{Z}& =\text{G}\\ \text{M}+\text{N}+\text{W}& =\text{H}\\ \text{T}+\text{U}+\text{V}& =\text{I}\\ \text{N}+\text{O}+\text{R}+\text{T}+\text{U}+\text{W}+\text{X}+\text{Y}+\text{Z}& =\text{L}\\ \text{Q}+\text{U}+\text{W}+\text{X}+\text{Y}& =\text{P}\end{array}$$ giving a score of $11$.## 3. Some analysis

The dimension of $\mathcal{L}$ is the dimension of the nullspace of $\mathcal{P}$, and the dimension of a nullspace is $26-\mathrm{rank}\mathcal{P}$, where the rank of $\mathcal{P}$ is a minimal generating set for the pictures of the letters themselves. There are 15 pixels, and this particular set of letters ends up fully exercising all the pixels. Mathematica agrees: the rank of $\mathcal{P}$ is 15. To complete the numerology, $26-15=11$, the maximal score. (Put a dual way, we have a system of $15$ equations corresponding to each pixel and of $26$ variables corresponding to each letter. Each equation, if it is not superfluous, decreases the degrees of freedom by one. So, $26-15$ is a priori a lower bound for the maximal score.)

If the pictures had been less independent of each other, then the maximal score could have been larger.

What if Stephen were $390$ monkeys at $390$ color pickers, and the letters were completely random? What are the chances that we would be in this universe where the maximal score is only $11$?

There is a recurrence describing the probability that a $n\times k$ matrix has rank $r$.

$$p(n,k,r)=\{\begin{array}{ll}0& \text{if}nr\text{or}kr\\ (1/2{)}^{nk}& \text{if}r=0\\ (1-(1/2{)}^{n-r+1})p(n,k-1,r-1)+(1/2{)}^{n-r}p(n,k-1,r)\end{array}$$ This comes from realizing that we can always assume that the first $k-1$ columns have been row-reduced through a change-of-basis, and change of basis does not affect the distribution.The value of $p(15,26,15)$ is approximately $99.95\mathrm{\%}$. So, fairly likely for the maximum score to be $11$. In fact, the expected value of the maximum score is just $11.0005$.

(An interesting fact while we are here: the limit of $p(n,n,n)$ as $n\to \mathrm{\infty}$ is not $0$ as it is over $\mathbb{R}$. Modulo $2$, it comes out to $28.9\mathrm{\%}$! Maybe it seems more reasonable if we realize this is the same as saying that the determinant of a square matrix of all $1$’s and $0$’s is odd about a quarter of the time. The expected rank of the matrix is pretty high though, about $n-0.85$.)

A consequence of the rank being $15$ and there being $15$ pixels is that you can draw anything.

A strange property about the letters as designed is that every letter can be drawn using the other letters. In linear algebra language, this is saying that if you remove any single letter, you can still draw the rest. For a given letter, the probability that you can still draw everything in a monkey-drawn alphabet is $p(15,25,15)\approx 99.9\mathrm{\%}$. I wasn’t able to work out the inclusion-exclusion for the probability that every letter has this property, but I’d expect it to be rather high.

^{[1]}There is one technicality I’m glossing over. Maybe there’s a formula that, for some unlucky reason, every permutation of the letters prematurely creates the wrong letter. There’s a way around this, which is to create one of these premature letters instead, but then there is some additional (but slight) complexity from needing to merge the old formula back into the run-through. If you care about this sort of thing, you’ll probably figure it out.