Deep Fictitious Play for Games with Continuous Action Spaces

Fictitious play is a classic algorithm for solving two-player adversarial games with discrete action spaces. In this work we develop an approximate extension of fictitious play to two-player games with high-dimensional continuous action spaces. We use generative neural networks to approximate the players' best responses while also learning a differentiable approximate model of the players' rewards given their actions. Both networks are trained jointly with gradient-based optimization to emulate fictitious play. We explore our approach in zero-sum games, non-zero-sum games, and security game domains.
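
For orientation, the classical discrete algorithm that the abstract starts from can be sketched as follows. This is a minimal illustration of textbook fictitious play in a two-player zero-sum matrix game, not the paper's deep method: each player repeatedly best-responds to the opponent's empirical action frequencies. The function name `fictitious_play`, the matching-pennies payoff matrix, the iteration count, and the choice of initial actions are all assumptions made for this example. In the continuous-action setting described above, the exact `argmax`/`argmin` best responses would presumably be replaced by the generative best-response networks, and the known payoff matrix by the learned differentiable reward model.

```python
import numpy as np


def fictitious_play(payoff, iterations=5000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i, j] is the row player's reward; the column player gets -payoff[i, j].
    Returns the empirical (time-averaged) mixed strategies of both players.
    """
    payoff = np.asarray(payoff, dtype=float)
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)

    # Arbitrary initial actions (an assumption of this sketch).
    row_counts[0] += 1
    col_counts[0] += 1

    for _ in range(iterations):
        # Expected reward of each action against the opponent's empirical play.
        row_values = payoff @ (col_counts / col_counts.sum())
        col_values = (row_counts / row_counts.sum()) @ payoff

        # Best responses: the row player maximizes, the column player minimizes.
        row_action = int(np.argmax(row_values))
        col_action = int(np.argmin(col_values))

        row_counts[row_action] += 1
        col_counts[col_action] += 1

    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


# Matching pennies: the unique equilibrium is uniform play for both players.
matching_pennies = np.array([[1.0, -1.0], [-1.0, 1.0]])
row_strategy, col_strategy = fictitious_play(matching_pennies)
print(row_strategy, col_strategy)
```

In zero-sum matrix games the empirical strategies are known to converge to an equilibrium, so both printed vectors should approach uniform play as the iteration count grows, although convergence is slow.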
