Pickomino (known as *Regenwormen* in Dutch, *Heckmeck* in German) is a dice game in which players try to roll high dice scores to collect tiles. It got me interested in probability, dynamic programming, and React. I built an online game with AI players, allowing human players to improve their game and perform in-depth game analyses.

Play Pickomino online in Chinese, Dutch, English, French, German, Japanese, or Spanish.

In this post, I explain how the AI was built, which will allow for in-depth game analyses.

## Key takeaways

- The probabilities of a turn can be calculated recursively.
- Play against an AI player that uses these probabilities and can strategize to win the game from humans.
- Improve your play by viewing the considerations that the AI makes.

Pickomino is a turn-based dice game played with two to seven players. It is played with eight dice and sixteen tiles. The dice are ordinary, but the six face is replaced with a the picture of a worm: . The tiles have values from 21 to 36 and are worth 1-4 worms. The game’s objective is to collect tiles with the largest sum of worms. Each turn a player can collect at most one tile by rolling dice.

The turn starts with one player rolling all eight dice. They can then collect each dice of one face. The player can decide to roll the remaining again or to pick a tile off the table with a value lower or equal to the sum of the collected dice. The worm-faced die face is a caveat: They are worth 5 points, and the player must have collected at least one of these to pick up any tile. In the consequent rolls, die faces can only be collected that have not been collected before. If there are no valid dice to pick up, the player is busted and must return the last tile the player has picked up.

One more complexity is that if the player has a dice collection *exactly* equal to the last tile another player has collected, a player can steal this tile.

Thomas ten Cate has previously written about the game in a blog post appropriately titled How to win at the game pickomino. If you are unfamiliar with the game, read up on it over there.

During a turn, the player tries to maximize the value of the dice they collect. This is part luck and part the result of the two decisions below. To play the best turn, it is essential to estimate these decisions’ consequences accurately. Most human players cannot accurately estimate these probabilities. To assist human players, I present a probability table below.

**Quit or roll?**First, the state of the dice collected can be represented by the frequency of each die face collected. Three fours and two worms would be $(0, 0, 0, 3, 0, 2)$. Should the player pick up any tile smaller or equal to 22 and finish their turn or decide to risk another roll?**Which face to collect?**The player decides to roll, which yields two twos and a three: $(0, 2, 1, 0, 0, 0)$. Collecting the twos would give a higher score, but collecting the threes leaves two dice to roll a five. What is more advantageous?

The table below the game gives the probability of reaching a tile value (21 to 36). Enter the dice you have **Coll**ected to see your odds when going for another turn. Enter the dice you **Roll**ed to see the chances for each die face you can collect. Similarly, we can calculate the probability of being able to steal a tile.

The probability table stores the probabilities of all possible game states. Storing all possible states in complex games like chess and Go is not feasible. Rolling dice in Pickomino, however, has relatively low complexity. With eight six-faced dice, there are only 3003 unique dice collection states; see the formula below. The probabilities of all these 3003 states can be stored in a 347kb `.json`

(or 15kb `.json.zip`

) file. This is the file that your browser uses to look up the probabilities in the table above.

$\lvert \text{ Dice combinations } \rvert = \sum_{r=0}^{8} \frac{{(r + 6 - 1)!}}{{r! \times (6 - 1)!}} = 3003$

Computing all probabilities shown above is a recursive task. This means we can calculate the likelihood of early-game states by working out all possible following-game states. I have implemented this in Python, which only takes a few seconds. The source code is available on GitHub. Below is the pseudocode of the operation:

pseudocode

```
Dice is ( int, int, int, int, int, int )
Probabilities are ( 21: float, 22: float, ..., 36: float )
storage.json is ( Dice: Probabilities, ...)
function Get_state_p( Dice ) -> Probabilities:
Probabilities[ <= Dice.score ] = 100%
Probabilities[ > Dice.score ] = Get_roll_p( Dice )
storage.json[Dice: Probabilities]
return Probabilities
function Get_roll_p( Dice ) -> Probabilities:
for roll_outcome in Roll( Dice.free_dice ) do
for pickup in roll_outcome[ DiceState == 0 ] do
roll_probabilities += max( Get_state_p( Dice + pickup ) )
return mean( roll_probabilities )
Get_state_p( ( 0, 0, 0, 0, 0, 0 ) )
```

Interestingly, to build the probability table of being able to steal a tile, all I had to do was swap the `<=`

operator for an `==`

operator. The two functions are invoked in a cycle, working their way through all game states and storing their results in the `container`

in the meantime:

I validated these results by simulating turns. If the difference between our calculated and simulated win-chance is small the calculations are implemented correctly. I built a function that simulates the rolling of dice. For each roll, it picks the die face with the best probability for a tile value to match the calculated win rate. I let it play 100.000 turns for each tile value. The mean difference between the calculated and simulated win rate was 0.06% [-0.12%, 0.25%]. The difference fell within the 99% confidence interval for all tile values. We can conclude that the probabilities from the calculator can be achieved.

The tables above help us estimate the risks, but to play the game, we also need to define the value of collecting certain tiles. Unlike our previous challenge, we cannot analyze all game states because with four players and sixteen tiles, $6×^{16}=2.8 \text{ trillion}$ possible game states. Instead, we can calculate a reward for collecting tiles. The reward consists of tile value and whether we are stealing it or picking it up. Another factor we need to consider is the potential loss of our last collected tile.

$\text{Appeal = Probability × ( Value × 2 if Stolen, 1 otherwise + Potential Loss )}$

We can use this appeal term both to select dice and decide if we want to end our turn. For example, when you have collected four worms and a four $(0, 0, 0, 1, 0, 4)$, the appeal of ending the turn and picking up a tile with value 1 is $1.00$, but that of continuing the turn is $1.92$. Second, we need to take into account the risks.

I have tested two alternative Steal-multiplier approaches, but they were not better. First, I considered only adding this multiplier when stealing from the leading player or leading player, based on the reasoning that the game’s goal is to win and not own losers. Second, I considered making the multiplier dependent on the number of players in the game because when there is much competition, stealing is not an effective way to fight the competition. This last method clearly fared worse. My conclusion is that since games can last a long time and anything can happen, I think it is advantageous to suppress all competition anyway.

I built an AI player in Python. It is not optimized for speed but can play a full game in about $62ms$. I am currently porting it to React, so stay put!

The development of AI players capable of competing at a human level, playing up to sixteen times per second, opens the door to fascinating game analyses.

The table below illustrates the average number of turns per game played by up to six AI players. When the stealing mechanism is disabled, the average number of turns remains consistent at 25, regardless of the number of players. However, when stealing is enabled, the average duration scales linearly with the number of players. By effectively leveraging the stealing mechanism, the AI can nearly double the total number of turns in a game.

In the early stages of the game, a plethora of tiles are available for players to collect. As the game advances, higher-value tiles remain on the table, reducing the likelihood of successful pick-ups. This dynamic not only increases the incentive to steal tiles but also raises the rate of loss. These trends are visually represented in the stacked area graph below, which depicts the mean turn outcomes across 10,000 games. The spike observed at the final turn is a natural artifact, as every game concludes with a player picking up a tile. The horizontal axis indicates the number of turns, scaled to reflect the average game length, while the vertical axis represents the proportion of turns that result in either picking up a tile from the table, losing a turn, returning a tile if the player has one, or stealing a tile from another player.

The player that starts has the first choice over tiles on the table. Is beginning the game an advantage, and if so, how big of an advantage?

I tested the alternative hypothesis that no first-play (dis)advantage exists. Under this hypothesis, the hypothesized win rate is $\frac{1}{n}$ with $n$ the number of players $2 \leq n \leq 7$ (ages eight and up). I simulated 10.000 games for each of these parties. I did so by enabling and disabling stealing because we have seen that it affects game duration. After discarding tied games, I used a one-tailed binomial test to determine if the observed win rate deviates significantly from our hypothesized win rate.

We can confidently reject the hypothesis that the first player’s win rate meets our hypothesis. Although the first-play effects are not drastic, you can use this table as an argument if you happen to be on a loss streak on any Sunday afternoon.

Players | Hypothesized win rate | Win rate without stealing | p-value without stealing | Win rate with stealing | p-value with stealing |
---|---|---|---|---|---|

2 | 50.00% | 53.38% | 2.3 × 10^{-11} |
51.93% | 8.4 × 10^{-5} |

3 | 33.33% | 37.65% | 2.1 × 10^{-18} |
35.68% | 1.0 × 10^{-6} |

4 | 25.00% | 28.79% | 2.8 × 10^{-16} |
25.93% | 2.1 × 10^{-2} |

5 | 20.00% | 23.92% | 4.2 × 10^{-19} |
21.82% | 1.1 × 10^{-5} |

6 | 16.67% | 21.43% | 7.8 × 10^{-30} |
17.64% | 8.2 × 10^{-3} |

7 | 14.29% | 17.93% | 3.8 × 10^{-20} |
15.69% | 1.3 × 10^{-4} |

Pickomino is a board game highly governed by chance. The game’s complexity during a turn is low, allowing any computer to map all states to outcome probabilities easily. The game at large, however, is highly complex and cannot be fully mapped. Furthermore, to use the probability information during a turn requires game strategic decisions about risk and reward. This means that human players could strategize to beat AI players.

Human players are generally disadvantaged in calculating probabilities and tracking competitor information compared to AI players. I present insight into these probabilities such that players can focus on optimizing the overall strategy of the game,

In-depth game analyses show that Pickomino has a significant first-player advantage. Players less experienced in using the stealing mechanism will play shorter games, increasing this first-player advantage.