DeepMind beats human players at Stratego
Winning at the board game Stratego means making decisions with incomplete information. For the first time, an AI, DeepNash, has now succeeded at it.
A tough challenge for DeepMind
Two things make Stratego a tough nut to crack for an AI. The first is the incomplete information: at the beginning of the game you only know where your own pieces are. The second is that a game of Stratego takes a very long time, often hundreds of moves. Together, these mean that the strategies used so far to win at poker, chess and Go do not work for Stratego.
Brute force doesn’t work either. The playing field is a lot smaller than in Go (10 × 10 instead of 19 × 19), but because there are so many different types of pieces that can be placed in any arrangement, the number of possibilities (a 535-digit number) exceeds the number of particles in the universe (an 80-digit number).
How did DeepMind manage to crack Stratego?
The DeepMind team developed the AI DeepNash for this purpose. DeepNash aims for a Nash equilibrium: a strategy that an opponent cannot exploit, which makes it very hard to beat. DeepNash taught itself Stratego by playing countless games against itself. This made DeepNash very good at Stratego: on the online board game platform Gravon, the artificial intelligence took third place in the ranking of the best players ever. On average, DeepNash won 84% of its games against the best human Stratego players in the world.
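To make the idea of reaching a Nash equilibrium through self-play a little more concrete, here is a minimal Python sketch on a far simpler game, rock-paper-scissors. It uses regret matching, a textbook technique chosen here purely for illustration; it is not the method DeepNash uses, which is described in the Nature article and is far more elaborate. The names PAYOFF, strategy_from_regrets and self_play are made up for this example.

```python
# Payoff to a player in rock-paper-scissors (a zero-sum game).
# Rows = own action, columns = opponent's action; order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        # No action is regretted yet, so play uniformly at random.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=10_000):
    """Return both players' average strategies after repeated self-play."""
    # Assumption for illustration: player 0 starts with a bias towards rock,
    # so that the learning dynamics are not trivial.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        # Both players pick their mixed strategy before anyone updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3))
                for a in range(3)
            ]
            expected = sum(
                strategies[player][a] * action_values[a] for a in range(3)
            )
            for a in range(3):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, average in enumerate(self_play()):
        print(player, [round(p, 3) for p in average])
```

In this toy game the equilibrium is to play each of the three actions one third of the time, and the average strategies of both players drift towards that mix as the number of iterations grows. Stratego is vastly larger, which is why a sketch like this only conveys the principle.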
More than just a game
This achievement taught the DeepMind team two important things. First, how to get an AI to make the best decisions based on incomplete information, a typically human skill. Second, how to make an AI weigh different outcomes to find the best one.
Of course, Stratego is a simple environment compared to the real world. But these successes bring real-world applications closer for future AI projects. Those who are not afraid of a bit of jargon can read the team's article in Nature.
More about Stratego
Stratego is a well-known board game that has existed in its current form since the end of World War II. For those who are not yet familiar with the game: on a board of 10 × 10 squares, each player places their 40 pieces in any order they want, as long as they stay within their own back four rows. The goal is to capture the enemy flag, one of the 40 pieces.
You cannot see how the opponent has arranged their pieces. The pieces have ranks, and the strongest piece, the marshal, beats every other moving piece.
The general, the second strongest piece, beats all pieces except the marshal, and so on. If two pieces of equal strength meet, both are removed from the board. The weakest piece, the spy, is the only piece capable of defeating the marshal. Then there are bombs, which cannot move but defeat every piece except the miner. So even once the enemy spy has been captured, your marshal still cannot rampage across the board unopposed.
You can therefore never be sure that any piece is completely safe, and being able to bluff well is essential to winning a game. That makes Stratego a popular and exciting game.
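For readers who think in code, the capture rules described above can be sketched in a few lines of Python. This is only an illustration: the numeric values in RANKS and the function name resolve_attack are invented for this example, only the pieces named in this article are included, and the sketch assumes the standard rule that the spy beats the marshal only when the spy is the one attacking.

```python
# Illustrative strengths: a higher number means a stronger piece.
# The exact numbers are an assumption; only their order matters here.
RANKS = {"spy": 1, "miner": 3, "general": 9, "marshal": 10}


def resolve_attack(attacker, defender):
    """Return which piece survives: 'attacker', 'defender' or 'none'."""
    if attacker == "bomb":
        raise ValueError("bombs cannot move, so they cannot attack")
    if defender == "bomb":
        # Only the miner can defuse a bomb; every other piece is lost.
        return "attacker" if attacker == "miner" else "defender"
    if attacker == "spy" and defender == "marshal":
        # Assumed standard rule: the spy wins only when it strikes first.
        return "attacker"
    if RANKS[attacker] == RANKS[defender]:
        # Equal strength: both pieces are removed from the board.
        return "none"
    return "attacker" if RANKS[attacker] > RANKS[defender] else "defender"


if __name__ == "__main__":
    for pair in [("marshal", "general"), ("spy", "marshal"),
                 ("marshal", "bomb"), ("miner", "bomb"),
                 ("general", "general")]:
        print(pair, "->", resolve_attack(*pair))
```

In a real game neither player sees the opponent's piece until the attack is made, which is exactly where the bluffing comes in.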