Paper information - Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

Julien Perolat, Bart de Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Remi Munos, David Silver, Satinder Singh, Demis Hassabis, and Karl Tuyls