Combinatorial game theory
This page is about the combinatorial game theory of Hex. For combinatorial game theory in general, see Wikipedia.
Combinatorial game theory is a formalism that can be used to analyze sequential two-player perfect information games. It was developed by Berlekamp, Conway, and Guy starting in the 1960s, and has since been applied to a variety of games, including Hex. One of the central ideas of combinatorial game theory is that games can be split into multiple components (for example, separate regions of the Hex board), and each component analyzed separately. Combinatorial game theory can be used to reason about such concepts as domination, reversibility, inferior moves, and equivalence of positions.
Contents
Concepts
In this section, we introduce some of the main concepts of combinatorial game theory, as it relates to Hex. We do this mostly by example. A more mathematical treatment follows later.
How the game ends
Normally, a game of Hex ends either when a player has connected their edges, or when a player resigns. However, for the game theoretic analysis, it is easier if we imagine that the game continues until the board is completely filled with stones. Notice that after one player has a winning path, filling the rest of board with stones can no longer change who the winner is. So we are free to assume that the game continues until there are no more empty cells on the board. We will make this assumption throughout this article.
Regions
By a region of the Hex board, we just mean a set of one or more cells on the board. The region may include special features such as board edges, and it can optionally be pre-populated with some stones.
Examples.
- The whole board is a region.
- A single empty cell is a region.
- The 3-cell corner is a region.
- The following region is completely surrounded by red and blue stones, with three groups or red stones (and therefore three groups of blue stones) in the boundary. This kind of region is called a 3-terminal region.
Outcomes
By an outcome for a region, we mean one of the possible ways of completely filling it with stones. Since we assume that the game continues until the board is completely filled, the outcome describes what the region may look like at the end of the game. We say that two outcomes for the same region are equivalent if for every way of filling the rest of the board (i.e., the part outside of the region) with stones, the first outcome produces the same winner as the second one. Since equivalent outcomes produce the same winner in all situations, we often consider them to be equal, i.e., when we say "the same outcome", we mean "equivalent outcomes".
Examples.
- When the region is the whole Hex board, there are only two possible outcomes, namely, "Red wins" and "Blue wins".
- When the region is a single empty cell, there are two possible outcomes, namely, "Red gets the cell" and "Blue gets the cell".
- For the 3-cell corner, there are 8 possible ways of filling it with red and blue stones. However, most of these are equivalent, and we end up with only 3 non-equivalent outcomes:
- Outcome 1: Red gets everything. This outcome can be achieved in 3 possible ways; notice that they are equivalent because the blue stone is dead in all cases.
- Outcome 2: Blue gets everything. This outcome can also be achieved in 3 possible ways.
- Outcome 3: Red and Blue each get half of the corner. This outcome can be achieved in 2 possible ways, because the corner cell is dead.
- For a 3-terminal region, there are 5 possible outcomes. Although there are many ways of filling the region with red and blue stones, at the end of the game, all that matters is which terminals are connected to each other. Suppose Red's terminals have been numbered 1, 2, and 3. Then from Red's point of view, the 5 outcomes are:
- Red connects all 3 terminals.
- Red connects only terminals 1 and 2.
- Red connects only terminals 1 and 3.
- Red connects only terminals 2 and 3.
- Red connects none of her terminals.
- Red connects all 3 terminals.
Positions and options
Consider a region of the Hex board. By a position in the region, we mean the state of the region after zero or more moves have been played, or in other words, a version of the region in which zero or more of its emtpy hexes have been filled with stones (of any color).
Players change positions by making moves. In a given position, we say that Red's options are all positions that Red can reach by making a single move in the region, and Blue's options are all the positions that Blue can reach by making a single move in the region.
Example. Consider the position
G =In this position, Red has two options:
G₁ = and G₂ =Also, Blue has two options:
H₁ = and H₂ =Combinatorial game notation
In combinatorial game theory, the players are called Left and Right. In Hex, Red is Left and Blue is Right. We therefore call Red's options left options and Blue's options right options.
In combinatorial game theory, the word game is used to mean position. Thus, an unfinished game has left and right options that are themselves games.
The notation for a combinatorial game is G = {G₁, …, Gₙ | H₁, ..., Hₘ}, where G₁, …, Gₙ are the left options and H₁, ..., Hₘ are the right options. Note that the game is enclosed in curly braces, and a vertical bar is used to separate the left options from the right options.
Example. Consider the position from the previous example:
G =As we saw, G has two left options G₁ and G₂ and two right options H₁ and H₂. We can write
Recursively, each of the positions G₁, G₂, H₁, H₂ is also a combinatorial game with a left and right option. For example,
By recursively decomposing all options, we arrive at this representation of the game G:
When a position has no empty cells, it is called an atomic position. Atomic positions have no left or right options and can cannot be further decomposed. Instead, we can replace each atomic position by the associated outcome. For example, the position G in this example is a 3-terminal position. Let us write "⊤" (pronounced "top") for the outcome "Red connects all three terminals", "⊥" (pronounced "bottom") for the outcome "Red connects no terminals", and "a" for the outcome "Red connects the right two terminals, but not the left one". With these notations, the combinatorial game G can be more succinctly described as:
G = {{⊤|⊤}, {⊤|a} | {a|⊥}, {⊤|⊥}}.
Remark. In a combinatorial game, the players do not always alternate. For example, in the game G, Red can move to the left option G₂ = {⊤|a}. Then Red can move again to the left option ⊤. The reason that players are not necessarily alternating is that the combinatorial game is meant to represent play in a region. For example, Red might move in the region, Blue might move somewhere outside the region, and then Red might move in the region again. Thus, although the players were alternating in the game at large, within the region, Red has made two consecutive moves.
Simplification of combinatorial games
The combinatorial game notation we introduced in the previous section is not very compact. In fact, it is basically a representation of the entire game tree, i.e., a description of all moves that both players could make in every possible position of the game. For a position with more than a few empty cells, the game tree is much too large to write down in its entirety.
Fortunately, combinatorial games can be simplified. Much like the rules by which school children learn to simplify (5 + 8) * 2 to 13 * 2 and then to 26, there is a specific set of rules for simplifying combinatorial games and calculating their value. These rules are a bit complicated and we will describe them in more detail in the mathematical section below. However, the main point is that the rules are mechanical and universal: every finite combinatorial game can be simplified to a unique value, and this can be done either by hand or by using a "CGT calculator". Moreover, when a Hex game has been decomposed into several disjoint regions, we can in principle calculate and simplify the value of each region independently, and then combine them to determine the value, and therefore the winner, of the entire game. In practice, this calculation is too hard when there is lots of empty space on the board, but can be feasible for certain end game positions.
Remark. Unlike in commerce, where the value of everything can be expressed in terms of money, the value of a Hex position is not simply a number. Rather, it is a specially simplified combinatorial game that is a concise description of what each player can get from the position. When two positions have the same value, then they are equivalent, even if they look very different on the Hex board. The following example shows the value of a Hex position, and also explains what that value means in English.
Example. Once again, we consider the position from the previous example:
G =As we saw, the combinatorial notation for this position is
- G = {{⊤|⊤}, {⊤|a} | {a|⊥}, {⊤|⊥}}.
Using the simplification rules we will discuss in more detail later, this can be simplified (or "evaluated") to
- G = {⊤ | {a | ⊥}}.
The English explanation of this value is something every Hex player will agree with: If Red moves first in the region, Red moves at x and connects all three terminals. After this, the region is settled, i.e., neither Blue nor Red wants to make another move there. If Blue moves first in the region, Blue moves at x and threatens to connect all of Blue's terminals, but Red can reply at y to achieve outcome a. In this latter case, Red connects terminals 2 and 3, but not terminal 1.
Comparison games
Comparing positions
When we consider two different Hex positions in the same region, we often ask questions like: are the positions equivalent? Is one of them better for Red than the other? We can formalize this notion by defining an order on positions.
Given two positions P and Q in the same Hex region, we say that P ≤ Q if Q is at least as good for Red as P. The symbol "≤" is pronounced "less than or equal", and is called an order relation. Note that the order is always defined from Red's point of view, i.e., by "better", we mean "better for Red". But of course Blue has the opposite point of view: from Blue's point of view, P ≤ Q means that P is at least as good for Blue as Q.
Remark. The ordering of positions is not total: one can have positions P and Q such that neither P ≤ Q nor Q ≤ P holds. For example, P could be better for Red in some situations, and better for Blue in others. Such positions are called incomparable, and we say that positions are partially ordered.
So far, we have not been very precise about what we mean by "at least as good". One way to define this is by considering the region to be part of a larger game, and then check to see who is winning that game. Specifically, P ≤ Q means that for every way of embedding the region inside a larger game, and no matter whose turn it is, if P is winning for Red, then Q is also winning for Red.
The comparison game
In practice, we do not have time to think about all of the infinitely many ways that the rest of the game could be affecting our region. It turns out that there is a much simpler way to prove that P ≤ Q, which only requires us to think about P and Q, and not about the rest of the board. This is done by playing the so-called comparison game, which we now describe.
Let P and Q be two positions in the same Hex region. The comparison game is a game played by two players, who are called Truthifier and Falsifier. Truthifier's goal is to prove that P ≤ Q and Falsifier's goal is to prove that P ≰ Q. Therefore, to prove P ≤ Q, we must describe a winning strategy for Truthifier.
The game is played as follows. The positions P and Q are set up next to each other. The players alternate, with Falsifier going first. A move by Falsifier consists of either placing a red stone in P or placing a blue stone in Q. A move by Truthifier consists of either placing a blue stone in P or placing a red stone in Q. Thus, whenever Falsifier plays in one component (P or Q), Truthifier has the option to either respond in the same component (using the opposite color) or in the opposite component (using the same color). The game continues until both P and Q are completely filled with stones, i.e., there are no more empty cells. At this point, it is easy to check whether P ≤ Q, because it amounts to checking that for every red path through P, there is a corresponding red path through Q. If this is the case at the end of the game, Truthifier wins; otherwise, Falsifier wins.
Remark. The above description of the comparison game works for Hex. For some other classes of combinatorial games, the description might be slightly more complicated, depending on such details as if and when passing is allowed. We will get back to this point in the mathematical section below.
Example. The page theorems about templates claims that B is at least as good for Red as A.
A: B:We now show how to prove A ≤ B using the comparison game. We must describe a winning strategy for Truthifier. Falsifier makes the first move. If Falsifier plays a red stone at x, Truthifier responds with a red stone at y. Since this kills the blue stone, the outcome of B is better for Red than that of A, so Truthifier wins. On the other hand, if Falsifier plays a blue stone at y, Truthifier responds with a blue stone at x. Since B has a red stone and A doesn't, B again has a better outcome than A, so Truthifier wins. Since Truthifier wins the comparison game no matter how Falsifier plays, we have proved A ≤ B.
Note that in this example, Truthifier always responds to Falsifier's move by playing in the opposite component. There are no other choices, since each component only has a single empty cell. Let us now consider a more complicated example.
Example. Here is another claim from theorems about templates. We claim that A ≤ C, where
A: C:To prove it, we consider every possible move by Falsifier.
- If Falsifier plays Red w, Truthifier responds with Red y, killing x, and leaving A ≤ C (since C has an empty cell where A has a blue stone).
- If Falsifier plays Blue x, Truthifier responds with Red y, killing x and leaving A ≤ C.
- If Falsifier plays Blue z, Truthifier responds with Red x, leaving the two regions equal.
- If Falsifier plays Blue y, Truthifier responds with Red x. Now if Falsifier plays Blue z, Truthifier responds with Blue w, leaving the two regions equal. If Falsifier instead plays Red w, Truthifier responds with Red z, capturing y and leaving C better than A.
Remark. In the last example, we did not always play the game to the end. Indeed, as soon as Truthifier reaches a position in which it is clear that A ≤ C, we can stop the comparison game, as we already know that Truthifier has a winning strategy from that point onwards.
Mathematical details
To be written.
References
- J. H. Conway, "On Numbers and Games". 1st edition 1976. 2nd edition, A. K. Peters, 2001.
- P. Selinger, "On the combinatorial value of Hex positions", Integers 22:G3, 2022. Also available from arxiv:2101.06694