Evolutionary Sorting

This is an experiment whose goal is to explore the possibility of a computer learning how to sort numbers with minimal human intervention. This particular program uses a genetic algorithm which spawns many different sorting algorithms and uses an evolutionary process that we'll explain below in order to slowly improve them.

The Pieces

Before getting into the evolutionary details, we'll explain what a "sorting algorithm" is in this context. In order to keep things simple, all of our algorithms will act on a given sequence of numbers in a similar manner to that of a turing machine. The algorithm starts at a particular number in the sequence and at every step it can decide to either:

  • Move over to the right without affecting the sequence. (MR)
  • Move over to the left without affecting the sequence. (ML)
  • Switch between the current number and the one on it's right. (SR)
For all of these operations, the number to the right of the last number is the first one and vice versa. In order to determine which one of the operations to carry out, the algorithm can use one of the following if statements:
  • If the number to the right is bigger. (IBR)
  • If the number to the left is bigger. (IBL)
  • If we are at the end of the sequence. (IAE)
Finally, in order to increase the number of possibilities, there is an operation PROC which is like an if statement except that both possibilities are executed.

What we mean by "Algorithm"

In our context, an algorithm is a binary tree whose nodes consist of the above components. For example:

At each step, algorithm depicted above first checks if it's at the end. If so, it checks if the number to it's right is bigger and either switches this number with the one on it's right or the one on it's left. If it is not at the end, it switches with the one on it's right and then moves to the right.

The Evolutionary Step

We can now describe how our program creates an algorithm that can actually sort numbers. Initially, we randomly generate a large number of trees (algorithms) as above. This is generation 1. We then evaluate each tree by running it about 100 times on a random sequence and checking how much more ordered the sequence gets. Since we also want our trees to be fairly compact we also hand out penalties based on the number of nodes.

Now comes the evolutionary step. In order to create generation 2 we select many pairs of generation one trees in a way such that trees with a higher evaluation have a better chance of getting picked. For each pair (A,B) we create two children by the "crossover" method. In our scenario this means that we randomly select nodes nA from A and nB from B and swap their subtrees. The two trees we are left with are the children. Of course, we also induce "mutations" with a small but non zero probability. For example, there is some probability that a SR command will turn into a SL command.

After creating generation 2, we repeat the process to create generation 3 and so on until we either find an algorithm that can perfectly sort or the universe decays into a static nothingness.

Results

At this point you're probably wondering if this is actually going anywhere and why I didn't just look up "how to sort" on stackoverflow. Well, the interesting thing is that after only about 10 generation, the program usually spits out something like this:

After looking at this for minute we notice that this is in fact a very well known sorting algorithm!

The source code for this project can be found here