Learning to play Diplomacy is a big deal for several reasons. Not only does it involve multiple players who all move at the same time, but each turn is preceded by a round of negotiation in which players chat in pairs, trying to form alliances or gang up on rivals. Only after negotiating do players decide which pieces to move and whether to honor or renege on their deals.

At each point in the game, Cicero models how the other players are likely to act based on the state of the board and its previous conversations with them. It then figures out how players can work together for mutual benefit and generates messages designed to achieve those aims.

To build Cicero, Meta marries two different types of AI: a reinforcement learning model that figures out what moves to make, and a large language model that negotiates with other players.
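For a concrete picture of that division of labor, here is a minimal, hypothetical sketch in Python. The class names and methods (PolicyModel, MessageModel, take_turn) are illustrative stand-ins, not Meta's actual code or API; the sketch only shows the shape of the loop described above, in which a planner predicts the other players' likely moves, chooses its own, and hands those intentions to a language-model component that drafts a negotiation message.

```python
# Hypothetical sketch of the two-part loop: a planning model that picks moves
# and a message model that negotiates toward them. These classes are stand-ins;
# Cicero's real components and interfaces are not reproduced here.

from dataclasses import dataclass, field


@dataclass
class GameState:
    board: dict                                    # e.g. {"Paris": "France", "Munich": "Germany"}
    dialogue: list = field(default_factory=list)   # prior messages exchanged with other players


class PolicyModel:
    """Stand-in for the planning component: predicts what rivals are likely
    to do, then chooses moves that work well against those predictions."""

    def predict_others(self, state: GameState) -> dict:
        # Placeholder: a real model would infer likely orders for each rival
        # from the board and the conversation history.
        return {player: "hold" for player in set(state.board.values())}

    def choose_moves(self, state: GameState, predictions: dict) -> list:
        # Placeholder: a real planner would search for orders that score well
        # against the predicted behaviour of the other players.
        return [f"advance from {territory}" for territory in state.board]


class MessageModel:
    """Stand-in for the language-model component: drafts a negotiation
    message conditioned on the moves the planner intends to make."""

    def draft(self, recipient: str, intended_moves: list) -> str:
        return f"To {recipient}: I plan to {intended_moves[0]}. Will you support me?"


def take_turn(state: GameState, recipient: str) -> tuple[list, str]:
    policy, messenger = PolicyModel(), MessageModel()
    predictions = policy.predict_others(state)        # model the other players
    moves = policy.choose_moves(state, predictions)    # plan this turn's orders
    message = messenger.draft(recipient, moves)        # negotiate toward that plan
    state.dialogue.append(message)
    return moves, message


if __name__ == "__main__":
    state = GameState(board={"Paris": "France", "Munich": "Germany"})
    print(take_turn(state, recipient="Germany"))
```

The key design point the sketch tries to capture is that the message is generated after, and conditioned on, the plan, so that what the agent says stays tied to what it actually intends to do.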

Cicero isn’t perfect. It still sent messages that contained errors, sometimes contradicting its own plans or committing strategic blunders. But Meta claims that human players often chose to collaborate with it over other players.

And it’s significant because while games like chess or Go end with a winner and a loser, real-world problems typically do not have such straightforward resolutions. Finding trade-offs and workarounds is often more valuable than winning. Meta claims that Cicero is a step toward AI that can help with a range of complex problems requiring compromise, from planning routes around busy traffic to negotiating contracts.