The Todd-Coxeter algorithm

One problem in group theory is to understand a group from a presentation. This is not an easy problem in general, and there are some notable obstructions. For instance, there is no hope to find an algorithm which can tell whether a particular word is the identity since the word problem is undecidable.

However, for finite presentations, those with a finite set of generators and a finite set of relations between them, the Todd-Coxeter algorithm is able to give a list of all elements of a finite group in an (unknown) finite time. In fact, the algorithm is able to do more: enumerate the cosets of a finite-index finitely generated subgroup. Or, more specificially, it is able to find a faithful group action with a given finitely generated stabilizer.

What we will do in this note is describe the Todd-Coxeter algorithm, give a rather short implementation of it, and prove its correctness. But, to simplify the proof of the algorithm, let us first take a diversion through the theory of group actions.

There is an online implementation of the algorithm at Math tools on this site.

1. Classification of group actions

A $G$ -set $S$ for a set $S$ and a group $G$ is a homomorphism $ϕ : G \to Bij S$ — that is, a left group action. Maps between $G$ -sets are functions on the underlying sets which commute with the respective group actions.

An invariant subset $S^{'}$ of $S$ is one for which $g S^{'} \subset S^{'}$ for all $g \in G$ , and a reducible $G$ -set is one which has a nonempty invariant proper subset. An irreducible $G$ -set is also known as transitive, and is characterized by each pair $s_{1}, s_{2} \in S$ having some $g \in G$ with $g s_{1} = s_{2}$ .

If $S^{'} \subset S$ is a nonempty proper invariant subset, then $S - S^{'}$ is as well. This is because if $g s \in S^{'}$ , then $g^{- 1} (g s) \in S^{'}$ , so $s \in S^{'}$ . The orbit of an element $s \in S$ is the set $G s$ , which is an invariant subset and irreducible. Having the same orbit is an equivalence relation, and so $S$ decomposes into a disjoint union of irreducible $G$ -sets, one per equivalence class. For this reason, we will now focus on classifying an irreducible $G$ -set.

A pointed $G$ -set is one with a distinguished element $s_{0}$ , called the basepoint, and maps between pointed $G$ -sets are $G$ -set maps which map the basepoint to the basepoint. A pointed $G$ -set has a corresponding subgroup $Stab S$ consisting of all elements $g \in G$ with $g s_{0} = s_{0}$ . This is actually a functor. For $f : S \to S^{'}$ , the induced map $f_{*} : Stab S \to Stab S^{'}$ is simply an inclusion, since for $g$ stabilizing $s_{0}$ , $g f (s_{0}) = f (g s_{0}) = f (s_{0})$ so $f (s_{0})$ is stabilized as well.

Let $H$ be the stabilizer for an irreducible pointed $G$ -set $S$ . Define the $G$ -set map $f : G / H \to S$ by $f (g H) = g s_{0}$ , where $G / H$ is the set of left cosets of $H$ , with the $G$ -action defined by left multiplication, and with $H$ as the basepoint. This is well-defined since $H$ is the stabilizer of $s_{0}$ , and injective for the same reason. Given $s \in S$ , let $g \in G$ be such that $g s_{0} = s$ , and so $f (g H) = g s_{0} = s$ , hence $f$ is surjective. Therefore, $f$ is a $G$ -set isomorphism. Thus, every irreducible pointed $G$ -set is isomorphic to the coset $G$ -set $G / H$ for some subgroup $H$ . This completes the classification.

In case this doesn’t seem recognizable: a corollary is that the cardinality of an orbit is the index of the stabilizer, so the cardinality of an orbit times the order of the stabilizer equals the order of $G$ .

There are still a few questions which should be answered. What is the importance of the basepoint? Suppose $s_{0}, s_{1} \in S$ , and $H_{0}$ is the stabilizer of $s_{0}$ , $H_{1}$ is the stabilizer of $s_{1}$ , and $g \in G$ is such that $g s_{0} = s_{1}$ . Then for $h \in H_{1}$ , $(g^{- 1} h g) s_{0} = g^{- 1} s_{1} = s_{0}$ , so $g^{- 1} h g$ is in $H_{0}$ . By symmetry, $H_{0}$ and $H_{1}$ are conjugate subgroups of one another.

What are the automorphisms of an irreducible $G$ -set? Let $f : G / H \to G / H$ be an automorphism (not pointed — if it were pointed, then it would be the identity). Since $f (g H) = g f (H)$ for all $g \in G$ , there is some $a \in G$ such that $f (g H) = g a h$ , so $f$ is determined by $a$ . Furthermore, for $h \in H$ , since $f (h H)$ is both $a H$ and $h a H$ , we have $a h^{'} = h a$ for some $h^{'} \in H$ , which implies $a$ is in the normalizer of $H$ . Conversely, a map $f (g H) = g a H$ is well-defined for $a \in N (H)$ . The only such maps which are the identity are when $a \in H$ , hence $Aut (G / H)$ is isomorphic to $N (H) / H$ . In particular, when $H$ is a normal subgroup, $Aut (G / H)$ is $G / H$ itself. The converse follows at least for $G / H$ finite, since then $H$ is a stabilizer for all of $G / H$ .

While it is something of a tautology to say that $G / H$ is a $G / H$ -set, it is worth pointing out because whenever a pointed $G$ -set has $H$ as a stabilizer, and $H$ is a normal subgroup, then that $G$ -set can also be regarded as a $G / H$ -set with the same basepoint. In general, if $N$ is normal subgroup which is a subset of the stabilizer, then the $G$ -set may also be regarded as a $G / N$ -set.

To relate this to the theory of covering spaces, there is a universal $G$ -set, namely $G$ as a $G$ -set. Whenever $H$ is a subgroup of $G$ , then the quotient map $G \to G / H$ is the unique pointed $G$ -set map between them.

A construction which will prove useful is a sort of quotient. Suppose $H$ is a normal subgroup of $G$ and $S$ is a $G$ -set. Then ${H s : s \in S}$ is a $G$ -set as well. As an aside: this is related to the induction/restriction adjunction. Restriction of $S$ to an $H$ -set is a functor, and the irreducible $H$ -subsets correspond to the elements of the above quotient. Induction of an $H$ -set $S$ to a $G$ -set is to take $G \times S$ with $(g h, s)$ equivalent to $(g, h s)$ and with the $G$ -action defined by multiplication in the left component.

2. Describing a $G$ -set with a Schreier graph

In this section, we will discuss a way to describe a $G$ -set when $G$ is generated by a set $X$ .

A Schreier graph $Γ$ for a pointed $G$ -set $S$ and generating set $X$ is a labeled directed graph with $S$ the vertex set and a $g$ -labeled edge $s \to g s$ for each $g \in X$ and $s \in S$ .

For a given label $g$ , every vertex has exactly one outgoing edge and one incoming edge with that label. This is because the $g$ acts on $S$ bijectively. For this reason, we can describe a path starting from $s \in S$ using only edge labels and the direction of travel along each edge. For $g \in X$ , $g$ will denote following the edge leaving a particular vertex, and $g^{- 1}$ the following the edge entering that vertex. A path $g_{1} \dots g_{n}$ from a vertex $s$ has each $g_{i}$ or $g_{i}^{- 1}$ in $X$ , and we follow the path in right-to-left order, so that $(g_{1} \dots g_{n}) s$ is the vertex at the end of the path.

A relation in $G$ is a word $g_{1} \dots g_{n}$ equaling the identity, and a relator in $Γ$ at $s \in S$ is a loop starting and ending at $s$ . Every relation corresponds to a relator at each vertex.

The set of relators at a particular vertex, taken as words in $G$ , is the stabilizer of that vertex. This gives a way to see whether a particular graph is the Schreier graph of a $G$ -set with a particular stabilizer: check that each vertex has the correct number of incoming and outgoing edges, check that each relation in $G$ is a relator in the graph, and check that the set of relators at the basepoint equals the stabilizer of the basepoint.

If $Γ$ is a Schreier graph $G$ , and if every element of a normal subgroup $N$ of $G$ is a relator in $Γ$ at $s_{0}$ , then $Γ$ is a Schreier graph of $G / N$ as well.

Suppose $G$ is a free group. The Schreier graph for $G$ as a $G$ -set is a tree, which is a loop-free graph. Given a normal subgroup $N$ of $G$ , the quotient $G$ -set $G / N$ has each element of $N$ form a loop in $G / N$ , and each relator in $G / N$ is an element of $N$ , or in other words a relation in $G / N$ . Suppose $N$ is the normal closure of some set of generators. Each generator is a relator in $G / N$ at every vertex, because conjugation brings a relator from the basepoint to any other vertex. It is sufficient to check each that each generator is a relator at every vertex to see that $N$ is a subset of the stabilizer.

Suppose $H$ is any subgroup of $G / N$ . We may regard it as a subgroup of $G$ instead by finding generators of $H$ . Then, the Schreier graph for $(G / N) / H$ is $G / (N H)$ , so the generators of $H$ are relators at the basepoint.

3. The algorithm

With these facts in mind, the Todd-Coxeter algorithm is fairly simple. Given the following data:

a finite set $X$ of generators,
a finite set $R$ of relations, and
a finite set $Y$ of generators for a subgroup

there is a group

G = ⟨ X ∣ R ⟩

and a subgroup

H = ⟨ Y ⟩

, and we may calculate a Schreier graph for

G / H

by the following process. While we talk about an infinite object, the Schreier graph of a free group, we only ever need to store a portion of a Schreier graph, with the graph outside the frontier assumed to be trees.

Begin with the Schreier graph of the free group generated by $X$ , with vertices labeled by positive integers. Let $s_{0}$ be the basepoint. All of the vertices are unlabeled and unmarked, but $s_{0}$ is labeled with $1$ .
For each generator in $Y$ , follow the path from $s_{0}$ , labeling the unlabeled vertices with fresh integers along the way, and then identify $s_{0}$ with the end vertex.
For each unlabeled unmarked vertex $s$ :
1. Label $s$ with a fresh integer.
2. Follow from $s$ each generator and generator’s inverse, labeling unlabeled neighbors with fresh integers.
3. For each relation in $R$ , follow the path from $s$ , labeling unlabeled vertices with fresh integers along the way, and then identify $s$ with the end vertex.
4. Mark $s$ .

To identify two vertices of a Schreier graph, remove the vertex with the greater label, and move all the edges from the removed vertex to the remaining vertex. Then, wherever the there are too many edges of a particular label around that vertex, identify the vertices at the other end of those edges. Since the set of labeled integers at any particular moment is finite, this identification process must eventually terminate. The result is a Schreier graph of a free group.

Notice the following facts about this algorithm:

At any moment, except while in the process of identifying vertices, the graph is some Schreier graph for the free group generated by $X$ .
Each vertex of the graph is eventually visited.
For any particular vertex, eventually every relation from $R$ is a relator.
Every generator of $H$ is a relator at the basepoint.
Every vertex has a label which eventually stabilizes since they are a decreasing sequence of positive integers.
Every path starting at $s_{0}$ eventually stabilizes in that all of the vertex numbers along the path eventually stabilize.

The last fact means that there is a limiting graph

Γ_{\infty}

, and the other facts imply that

Γ_{\infty}

is a Schreier graph for the free group whose stabilizer contains both

H

and the normal subgroup generated by

R

Thus, the result may be regarded as a $G$ -set. Since the graph is path-connected, it corresponds to an irreducible $G$ -set, and if $[G : H]$ is finite, then the algorithm must terminate: the graph has no more vertices than $[G : H]$ , so there is some point where all of the neighbors of those vertices have stabilized. There is a path between the basepoint and any labeled vertex, so at this point such a path remains within the stabilized vertices.

The question remains whether the result actually is isomorphic to $G / H$ itself. Each step of the algorithm creates a new Schreier graph whose stabilizer contains the stabilizer of the previous Schreier graph, and so there is an increasing chain of subgroups, each of which is inside the limiting $H$ . A relator in $Γ_{\infty}$ is a path, and so there is some step of the algorithm such that that path in that step’s Schreier graph is a loop, and so it is an element of that step’s stabilizer, and hence of $H$ .

That is it for correctness.

Just for the sake of some intuition: the way I think about the algorithm is that we are creating a covering space of the wedge sum of circles, one circle per generator of $G$ . We lazily quotient the covering space by the relations of $G$ and the generators of $H$ to produce a covering space whose fiber is in one-to-one correspondence with $G / H$ .

One thing to notice about the algorithm is that there is no need for $[G : H]$ to be finite, other than to guarantee it will terminate. There is still a limit graph $Γ_{\infty}$ , but unfortunately there is no way to tell when the graph has stabilized!

4. Implementation

Now we give a concrete implementation of the algorithm in Python. For simplicity, we will just compute $S_{3} / ⟨ b ⟩$ with the presentation $⟨ a, b | a^{3}, b^{2}, a b a b ⟩$ . You can download tc.py.

The main data structures are

idents = []
neighbors = []
to_visit = 0

The idents variable is a list of numbers, with the property that idents[i] <= i. This is a union-find data structure for keeping track of the vertex quotients for the Schreier graph so far. The find function looks up the current-lowest label for a particular labeled vertex:

def find(c):
    c2 = idents[c]
    if c == c2:
        return c
    else:
        c2 = find(c2)
        idents[c] = c2
        return c2

For vertices which have not been removed, the neighbors data structure contains a list of all the neighboring vertices, with None for non-existent neighbors — non-existent neighbors are presumed to be a portion of the free group’s universal Schreier graph, namely a tree. Each entry is another vertex label. Vertex labels may be out of date, so find must be run on them to ensure validity.

The to_visit variable keeps track of which vertices have been marked. Any vertex whose label is less than to_visit is considered marked.

We record the relations and generators in the following structures, the details of which are not that important:

ngens = 2
rels = [
    (1, 0), # a^-1a
    (3, 2), # b^-1b
    (0, 0, 0), #a^3
    (2, 2), # b^2
    (0, 2, 0, 2) # abab
]
hgens = [
    (2,), # b
]

We use 0 and 1 for

a

and

a^{- 1}

, and 2 and 3 for

b

and

b^{- 1}

. For simplicity of the core algorithm, we introduce

a^{- 1} a

and

b^{- 1} b

as explicit relations.

The identification procedure is called unify.

def unify(c1, c2):
    c1 = find(c1)
    c2 = find(c2)
    if c1 == c2:
        return
    c1, c2 = min(c1, c2), max(c1, c2)
    idents[c2] = c1
    for d in xrange(2*ngens):
        n1 = neighbors[c1][d]
        n2 = neighbors[c2][d]
        if n1 == None:
            neighbors[c1][d] = n2
        elif n2 != None:
            unify(n1, n2)

It takes two vertices, and makes the one with the greater label refer to the lesser one with idents[c2] = c1. Then, it moves all of the neighbors of the deleted vertex by recursively calling unify. In case the neighbor being replaced is non-existent, which is the base case of the recursion, we just record it as a neighbor. There is no need to record this identification on the other end of the edge because the other end is referring to c2, which is now c1.

For following paths, we have the following two functions:

def follow(c, d):
    c = find(c)
    ns = neighbors[c]
    if ns[d] == None:
        ns[d] = new()
    return find(ns[d])

def followp(c, ds):
    c = find(c)
    for d in reversed(ds):
        c = follow(c, d)
    return c

The first takes a vertex and finds the neighbor in the d direction, creating a new vertex in that direction if needed, and the second follows a list of directions to find the end of a path. The follow function creates new neighbors with the following function:

def new():
    c = len(idents)
    idents.append(c)
    neighbors.append((2*ngens)*[None])
    return c

This just creates a fresh label so that idents[c] == c, and initializes the neighbors array to point to non-existent vertices. There is no need for follow or new to record the neighbors on the new vertex, since the relations such as

a^{- 1} a

will eventually take care of this.

Finally, there is the core algorithm:

start = new()

for hgen in hgens:
    unify(followp(start, hgen), start)

while to_visit < len(idents):
    c = find(to_visit)
    if c == to_visit:
        for rel in rels:
            unify(followp(c, rel), c)
    to_visit += 1

It creates the start vertex, adds all of the relations for

H

as relators at this basepoint, and then for each unmarked vertex, adds all of the relations from rels. Notice how unify is being used to turn paths into relators.

After this, the data structures contain the Schreier graph for $G / H$ . This can be interpreted as a permutation representation, for instance using

cosets = [c for i, c in enumerate(idents) if i == c]

perms = [[cosets.index(follow(c, 2*d)) for i, c in enumerate(cosets)]
         for d in range(ngens)]

to enumerate the cosets (which are vertices which have the least label in their equivalency classes), and then taking edges of the Schreier graph for each generator to construct a permutation.

The permutations can be written in cycle notation fairly easily. One way is with

def cycle(perm):
    parts = []
    for i in range(len(perm)):
        part = [str(i+1)]
        k = perm[i]
        while k != i:
            if k < i: break
            part.append(str(k+1))
            k = perm[k]
        else:
            parts.append(" ".join(part))
    return "("+")(".join(parts)+")"

for d in range(ngens):
    print "g%d ="%d, cycle(perms[d])

For these particular relations, the output is

g0 = (1 2 3)
g1 = (1)(2 3)

which means there are three cosets. The first generator acts cyclically on the three cosets, and the second generator merely swaps the non-subgroup cosets. This actually gives a faithful representation of

S_{3}

Replacing $H$ with the trivial subgroup, we instead get

g0 = (1 2 4)(3 5 6)
g1 = (1 3)(2 6)(4 5)

which means

S_{3}

has six elements.

5. Applications

There are a number of things one can deduce from the result of the algorithm.

Of course, if it terminates, we deduce $[G : H]$ is finite (and know what it equals).
A permutation representation of $G / H$ , given by a permutation of $G / H$ for each generator of $G$ .
Whether $H$ is a normal subgroup. This can be determined by seeing whether each generator’s permutation is an element of $Aut (G / H)$ , since then $Aut (G / H) = G / H$ , implying $H$ is normal.
An algorithm for the word problem for the group. Given two words, follow the graph from the basepoint to see whether they end on the same vertex.

6. References

Artin, Michael. Algebra. — Contains the older-style algorithm of maintaining relation tables.
Brown, Ken. The Todd-Coxeter procedure. http://www.math.cornell.edu/~kbrown/7350/toddcox.pdf — Gives the description of the algorithm from the point of view of Schreier graphs and describes a proof of termination.
Sher and Daverman. Handbook of Geometric Topology — Ran into this while preparing these notes; it was the only reference I found which had a topological way of understanding the algorithm. I was going to describe it more topologically, but I realized $G$ -sets might be clearer for a symbolic explanation.