AI for "Button Men"

I wanted to attempt to write an Artificial Intelligence for a game that involved chance. I eventually stumbled across a game called Button Men which uses dice as a mechanic to simulate a battle between two brawlers. I implemented two different A.I. Techniques to play this game and ran them against each other to see which was better.

Image from Button Men website — Image from Button Men website

Button Men is a game published and made freely available by Cheapass Games. They publish the full rules on their website here [PDF] but I will provide a summary of them. The goal of the game is to win three rounds, rounds are won by capturing the opponents dice. At the start of each game, players select a “fighter” who determines the dice they receive. All dice have between 4 and 20 sides. Each player also receives an additional “swing die”, a die that has a number of sides of their choosing, but constrained to between 4 and 20 sides.

A round begins with each player rolling all their dice. The player with the lowest roll goes first. They can then make one of two moves, a skill attack or a power attack, defined as:

Power Attack: Use one of your dice to capture one of your opponent's dice. The number showing on your die must be greater than or equal to the number showing on the die you capture.
Skill Attack: Use several of your dice to capture one of your opponent's dice. In this attack, your
dice must add up exactly to the value showing on the die you capture.

For each die captured, the capturing player is rewarded points based on the number of faces the captured die had. The capturing dice are re-rolled and play switches to the opponent. If a player can not make a legal move they must pass. Play continues until both players pass, at which point they are
rewarded half points for their remaining dice. At this point, the round is over and if neither player has yet won three rounds, a new round begins allowing players to re-select their swing die.

As mentioned earlier, each player selects a fighter who determines the dice they get. The fighters are defined as follows, where numbers denote the number of faces each die has, and X represents the
swing die:

I developed two different algorithms for playing this game, one a Monte Carlo style algorithm that relies on random game playing, and the other an implementation of the Expectimax algorithm. Both these algorithms are used to determine move choice and swing die.

The Monte Carlo algorithm is very simple, for any given decision, for each possible choice it plays 50 random games and returns the choice that has the most number of wins. When selecting a move, it uses a list of potential moves at the current state as starting points and has both players make random moves from then til the game is finished. When selecting a swing die it simulates games adding 4 through 20 sided die and playing games where each player selects the move with the highest point conversion. This choice was made as it created consistent results.

Expectimax is an extension of the Mini-Max algorithm that introduces intermediary expectation nodes to deal with the random events. For a given move, each die that will be re-rolled, or each combination of dice in the case of skill attack, generates an expectation node for each die face that could potentially be rolled afterwards. When returning the heuristic score expectation nodes return the average score from all their children. In selecting a swing die, the AI uses the best score from all moves available with a given die to compare to scores attained the same way with other dice.
When pitted against each other, the Monte Carlo algorithm typically out performs the Expectimax algorithm. The following is data attained by playing a game for each fighter using one algorithm against every fighter using the other algorithm. This was done three times:

Monte Carlo algorithm won 79 games, 54.86% of games
Expectimax algorithm won 65 games, 45.14% of games
Monte Carlo algorithm won 196 rounds, 55.84% of rounds
Expectimax algorithm won 155 rounds, 44.16% of rounds
Avis wins 6 with MC and 5 with EM.
Bauer wins 9 with MC and 6 with EM.
Kith wins 7 with MC and 4 with EM.
Karl wins 8 with MC and 4 with EM.
Niles wins 3 with MC and 4 with EM.
Hannah wins 5 with MC and 5 with EM.
Hammer wins 7 with MC and 8 with EM.
Stark wins 4 with MC and 2 with EM.
Clare wins 6 with MC and 7 with EM.
Iago wins 6 with MC and 5 with EM.
Shore wins 7 with MC and 10 with EM.
Kublai wins 11 with MC and 5 with EM.

Monte Carlo algorithm won 94 games, 65.28% of games
Expectimax algorithm won 50 games, 34.72% of games
Monte Carlo algorithm won 210 rounds, 61.05% of rounds
Expectimax algorithm won 134 rounds, 38.95% of rounds
Avis wins 7 with MC and 3 with EM.
Bauer wins 8 with MC and 6 with EM.
Kith wins 8 with MC and 1 with EM.
Karl wins 9 with MC and 4 with EM.
Niles wins 9 with MC and 5 with EM.
Hannah wins 8 with MC and 7 with EM.
Hammer wins 9 with MC and 2 with EM.
Stark wins 8 with MC and 3 with EM.
Clare wins 7 with MC and 6 with EM.
Iago wins 3 with MC and 1 with EM.
Shore wins 9 with MC and 7 with EM.
Kublai wins 9 with MC and 5 with EM.

Monte Carlo algorithm won 81 games, 56.25% of games
Expectimax algorithm won 63 games, 43.75% of games
Monte Carlo algorithm won 189 rounds, 54.62% of rounds
Expectimax algorithm won 157 rounds, 45.38% of rounds
Avis wins 5 with MC and 5 with EM.
Bauer wins 8 with MC and 4 with EM.
Kith wins 6 with MC and 3 with EM.
Karl wins 5 with MC and 6 with EM.
Niles wins 7 with MC and 3 with EM.
Hannah wins 8 with MC and 6 with EM.
Hammer wins 6 with MC and 6 with EM.
Stark wins 4 with MC and 6 with EM.
Clare wins 9 with MC and 8 with EM.
Iago wins 5 with MC and 3 with EM.
Shore wins 11 with MC and 8 with EM.
Kublai wins 7 with MC and 5 with EM.

This data shows there is about a 60/40 split for success when comparing the Monte Carlo and Expectimax algorithms.

In producing the data above, it does need to be noted that both players chose their Swing die with the Monte Carlo algorithm, and the Expectimax does not do skill attacks involving more than two dice. These decisions were made for the stake of stability. In the game, it is possible to make a move with two (or more) dice, if one were 4 sided and the other 20 sided, this move would spawn 80 children expectation nodes. As such, on occasion the Expectimax algorithm would generate a tree so large an “out of memory” error would occur. The Expectimax algorithm is depth limited, and the code allows it to expand its depth as the number of dice in play decreases which decreases alongside the number of available moves. For most of the game though, it is restrained to looking one move ahead. Though data is not as complete as above, in experience allowing swing die to be chosen with expectimax produces similar results as those chosen with Monte Carlo and played with expectimax.

Monte Carlo was clearly the better algorithm. From start to finish most games will take 9 turns. Simulating a game can be done very quickly because of this. Expectimax looks ahead very systematically, where Monte Carlo looks ahead randomly, but can do so much deeper than expectimax, for this game and its high randomness, the deeper look bests the systematic look.

When playing against the AI (I choose to play against the Monte Carlo Algorithm) the AI feels competent, and on a level that matches my own. It seems to fall prey to attacks where I can force it to not have a turn or make poor moves, and rarely uses that strategy against myself, although opportunities to do so are rare.

In a game as random as this, I feel my developed algorithms do a good job at intelligent play. It can be hard to judge intelligence over luck in this game, but when I simulated the complete set of possible games based on the set of fighters, I believe this showed that the Monte Carlo algorithm is superior. Future versions of the code could include a pruned and more efficient Expectimax and a more intelligent Monte Carlo algorithm.