For fans of the classic board game Stratego, researchers at DeepMind may have created your nemesis. DeepNash, the latest offering from the well-known research group, is a new AI that has mastered the game. In an announcement on Twitter, DeepMind said the AI learned by playing against itself more than five billion times. As players of the board game know, mastering it at all is an impressive feat. According to DeepMind, the model uses a reinforcement learning algorithm and has so far beaten almost every human player it has encountered.
Introducing DeepNash – the first AI to master Stratego, a game of hidden information which is more complex than chess, Go and poker.
Published in @ScienceMagazine, it uses a novel model-free reinforcement learning algorithm.
— DeepMind (@DeepMind) December 1, 2022
For those unfamiliar with the game, two players each try to capture the opponent’s flag, relying on hidden information and, as in poker, bluffing. The hidden information comes from the fact that neither player can see the rank or arrangement of the other’s pieces at the beginning of the game. What makes this an interesting learning exercise for AI is that human players must weigh the possible outcomes when setting up the board, then take their turns with only that limited information. The AI, like any player, faced the same limits, which makes Stratego a good test of learning from incomplete information. Each player commands an army of forty pieces that represent soldiers. Capturing the flag isn’t the only way to win, though: a player who is left unable to move also loses the game.
This isn’t the first game that AI has been tasked with mastering. From Chess to Go, AI systems have, over the years, learned the basics of strategy and applied them. But Stratego is arguably more complex than either. That is due not only to the lack of complete information but also to sheer scale: as The Byte states, Stratego has 10 to the power of 535 possible game states. For reference, Go has 10 to the power of 360. Mastering a game of that size takes an impressive level of skill. And though DeepNash has mastered a board game, the result matters for more than winning games. Teaching an AI to make judgments without the full scope of information is another step forward in the development of the technology.
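To put the two figures quoted above side by side, here is a minimal Python sketch of the arithmetic. The constants are simply the exponents reported in the article, and the helper name `orders_of_magnitude_gap` is a hypothetical label chosen for illustration, not anything from DeepMind’s work.

```python
# Exponents as quoted in the article (attributed there to The Byte).
STRATEGO_EXPONENT = 535  # Stratego: about 10^535 possible game states
GO_EXPONENT = 360        # Go: about 10^360 possible game states


def orders_of_magnitude_gap(larger_exp: int, smaller_exp: int) -> int:
    """Return how many powers of ten separate two counts of the form 10^n."""
    return larger_exp - smaller_exp


# Hypothetical helper applied to the quoted figures.
gap = orders_of_magnitude_gap(STRATEGO_EXPONENT, GO_EXPONENT)

# Python integers are arbitrary precision, so the ratio can be computed exactly.
ratio = 10 ** STRATEGO_EXPONENT // 10 ** GO_EXPONENT

print(f"Stratego has roughly 10^{gap} times as many game states as Go")
```

In other words, if the quoted figures are taken at face value, the gap between the two games is itself a number with well over a hundred digits.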
One thing that came up in this study was DeepNash’s ability to develop an “unpredictable strategy.” That kept its human opponents guessing about what it was doing, which is quite important in a game such as Stratego. Finally, DeepNash has become something of a figure within the Stratego world. According to DeepMind, the AI has become so good at the game that it has achieved a top-three ranking on the Stratego platform Gravon. Over fifty games, it posted a remarkable 84% win rate.