Learning While Playing in Mean-Field Games: Convergence and Optimality
Qiaomin Xie | Zhuoran Yang | Zhaoran Wang | Andreea Minca
[1] Pablo Hernandez-Leal,et al. A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity , 2017, ArXiv.
[2] Romuald Elie,et al. Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications , 2020, NeurIPS.
[3] Hamidou Tembine,et al. Mean field difference games: McKean-Vlasov dynamics , 2011, IEEE Conference on Decision and Control and European Control Conference.
[4] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[5] P. Lions,et al. Jeux à champ moyen. I – Le cas stationnaire , 2006 .
[6] O. H. Brownlee,et al. ACTIVITY ANALYSIS OF PRODUCTION AND ALLOCATION , 1952 .
[7] Bernhard Schölkopf,et al. A Kernel Method for the Two-Sample-Problem , 2006, NIPS.
[8] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[9] Sean P. Meyn,et al. Learning in Mean-Field Games , 2014, IEEE Transactions on Automatic Control.
[10] Aditya Mahajan,et al. Reinforcement Learning in Stationary Mean-field Games , 2019, AAMAS.
[11] Naci Saldi,et al. Value Iteration Algorithm for Mean-field Games , 2019, ArXiv.
[12] Tamer Basar,et al. Discrete-time LQG mean field games with unreliable communication , 2014, 53rd IEEE Conference on Decision and Control.
[13] Olivier Guéant,et al. Mean Field Games and Applications , 2011 .
[14] Shie Mannor,et al. Regularized Policy Iteration with Nonparametric Function Spaces , 2016, J. Mach. Learn. Res..
[15] Ding-Xuan Zhou,et al. Distributed Learning with Regularized Least Squares , 2016, J. Mach. Learn. Res..
[16] Mathieu Lauriere,et al. Unified reinforcement Q-learning for mean field game and control problems , 2022, Mathematics of Control, Signals, and Systems.
[17] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[18] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[19] Ramesh Johari,et al. Equilibria of Dynamic Games with Many Players: Existence, Approximation, and Market Structure , 2010, J. Econ. Theory.
[20] Bernhard Schölkopf,et al. Hilbert Space Embeddings and Metrics on Probability Measures , 2009, J. Mach. Learn. Res..
[21] Csaba Szepesvári,et al. Finite-Time Bounds for Fitted Value Iteration , 2008, J. Mach. Learn. Res..
[22] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[23] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[24] Qiaomin Xie,et al. Dynamic Regret of Policy Optimization in Non-stationary Environments , 2020, NeurIPS.
[25] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[26] Romuald Elie,et al. On the Convergence of Model Free Learning in Mean Field Games , 2020, AAAI.
[27] Robert Babuska,et al. Decentralized Reinforcement Learning of Robot Behaviors , 2018, Artif. Intell..
[28] Bernhard Schölkopf,et al. A Kernel Two-Sample Test , 2012, J. Mach. Learn. Res..
[29] Matthieu Geist,et al. A Theory of Regularized Markov Decision Processes , 2019, ICML.
[30] Joel Z. Leibo,et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas , 2017, AAMAS.
[31] Demis Hassabis,et al. Mastering Atari, Go, chess and shogi by planning with a learned model , 2019, Nature.
[32] Shimon Whiteson,et al. Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs , 2008, ECML/PKDD.
[33] T. Başar,et al. Dynamic Noncooperative Game Theory , 1982 .
[34] P. Lions,et al. Mean field games , 2007 .
[35] Jim Duggan,et al. An Experimental Review of Reinforcement Learning Algorithms for Adaptive Traffic Signal Control , 2016, Autonomic Road Transport Support Systems.
[36] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[37] Barnabás Póczos,et al. Two-stage sampled learning theory on distributions , 2015, AISTATS.
[38] Bart De Schutter,et al. Decentralized Reinforcement Learning Control of a Robotic Manipulator , 2006, 2006 9th International Conference on Control, Automation, Robotics and Vision.
[39] Barbara Messing,et al. An Introduction to MultiAgent Systems , 2002, Künstliche Intell..
[40] Piotr Więcek,et al. Discrete-Time Ergodic Mean-Field Games with Average Reward on Compact Spaces , 2019, Dynamic Games and Applications.
[41] Nando de Freitas,et al. Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning , 2018, ICML.
[42] Dale Schuurmans,et al. Bridging the Gap Between Value and Policy Based Reinforcement Learning , 2017, NIPS.
[43] Shie Mannor,et al. Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs , 2020, AAAI.
[44] Erfu Yang,et al. Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey , 2004 .
[45] Naci Saldi,et al. Fitted Q-Learning in Mean-field Games , 2019, ArXiv.
[46] Bernhard Schölkopf,et al. Learning from Distributions via Support Measure Machines , 2012, NIPS.
[47] Renyuan Xu,et al. A General Framework for Learning Mean-Field Games , 2020, Mathematics of Operations Research.
[48] P. Caines,et al. Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions , 2003, 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475).
[49] Yingke Chen,et al. Decision-Theoretic Planning Under Anonymity in Agent Populations , 2017, J. Artif. Intell. Res..
[50] Tamer Basar,et al. Markov-Nash equilibria in mean-field games with discounted cost , 2016, 2017 American Control Conference (ACC).
[51] P. Lions,et al. Jeux à champ moyen. II – Horizon fini et contrôle optimal , 2006 .
[52] Yoav Shoham,et al. If multi-agent learning is the answer, what is the question? , 2007, Artif. Intell..
[53] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[54] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[55] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[56] Tamer Basar,et al. Approximate Equilibrium Computation for Discrete-Time Linear-Quadratic Mean-Field Games , 2020, 2020 American Control Conference (ACC).
[57] Jason D. Lee,et al. Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima , 2019, arXiv:1905.10027.
[58] David C. Parkes,et al. Learning to Collaborate in Markov Decision Processes , 2019, ICML.
[59] Bernhard Schölkopf,et al. Kernel Mean Embedding of Distributions: A Review and Beyond , 2016, Found. Trends Mach. Learn..
[60] Le Song,et al. A Hilbert Space Embedding for Distributions , 2007, Discovery Science.
[61] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[62] Jalaj Bhandari,et al. Global Optimality Guarantees For Policy Gradient Methods , 2019, ArXiv.
[63] Stephen Clark,et al. Emergent Communication through Negotiation , 2018, ICLR.
[64] Tamer Basar,et al. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms , 2019, Handbook of Reinforcement Learning and Control.
[65] Yongxin Chen,et al. Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games , 2019, ICLR.
[66] Jalaj Bhandari,et al. A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation , 2018, COLT.
[67] A. Caponnetto,et al. Optimal Rates for the Regularized Least-Squares Algorithm , 2007, Found. Comput. Math..
[68] Joel Z. Leibo,et al. Social Diversity and Social Preferences in Mixed-Motive Reinforcement Learning , 2020, AAMAS.
[69] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2018, Autonomous Agents and Multi-Agent Systems.
[70] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[71] Devavrat Shah,et al. Q-learning with Nearest Neighbors , 2018, NeurIPS.
[72] J. Nash. Equilibrium Points in N-Person Games , 1950, Proceedings of the National Academy of Sciences of the United States of America.
[73] Yuxin Chen,et al. Fast Global Convergence of Natural Policy Gradient Methods with Entropy Regularization , 2020, Oper. Res..
[74] D. Gomes,et al. Discrete Time, Finite State Space Mean Field Games , 2010 .