Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
Thore Graepel | Karl Tuyls | Marc Lanctot | Paul Muller | Luke Marris
[1] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[2] Tom Eccles,et al. Human-Agent Cooperation in Bridge Bidding , 2020, ArXiv.
[3] Bernhard von Stengel,et al. Extensive-Form Correlated Equilibrium: Definition and Computational Complexity , 2008, Math. Oper. Res.
[4] Roy Fox,et al. Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games , 2020, NeurIPS.
[5] C. Tsallis. Possible generalization of Boltzmann-Gibbs statistics , 1988 .
[6] G. S. Buttar,et al. A Brief Review on Different Measures of Entropy , 2019 .
[7] Christos H. Papadimitriou,et al. α-Rank: Multi-Agent Evaluation by Evolution , 2019, Scientific Reports.
[8] Michael H. Bowling,et al. Solving Common-Payoff Games with Approximate Policy Iteration , 2021, AAAI.
[9] Sang Joon Kim,et al. A Mathematical Theory of Communication , 2006 .
[10] Pierre Baldi,et al. XDO: A Double Oracle Algorithm for Extensive-Form Games , 2021, ArXiv.
[11] Paul W. Goldberg,et al. The complexity of computing a Nash equilibrium , 2006, STOC '06.
[12] Stephen P. Boyd,et al. CVXPY: A Python-Embedded Modeling Language for Convex Optimization , 2016, J. Mach. Learn. Res.
[13] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[14] Christopher M. Bishop,et al. Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .
[15] Avrim Blum,et al. Planning in the Presence of Cost Functions Controlled by an Adversary , 2003, ICML.
[16] Bernd Gärtner,et al. Understanding and Using Linear Programming (Universitext) , 2006 .
[17] Guy Lever,et al. A Generalized Training Approach for Multiagent Learning , 2020, ICLR.
[18] A. Wald. Contributions to the Theory of Statistical Estimation and Testing Hypotheses , 1939 .
[19] Tom Eccles,et al. Learning to Play No-Press Diplomacy with Best Response Policy Iteration , 2020, NeurIPS.
[20] Nicola Gatti,et al. Learning to Correlate in Multi-Player General-Sum Sequential Games , 2019, NeurIPS.
[21] R. Aumann. Subjectivity and Correlation in Randomized Strategies , 1974 .
[22] Jonathan Gray,et al. Human-Level Performance in No-Press Diplomacy via Equilibrium Search , 2020, ICLR.
[23] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[24] Stephen Boyd,et al. A Rewriting System for Convex Optimization Problems , 2017, ArXiv.
[25] Tuomas Sandholm,et al. Coarse Correlation in Extensive-Form Games , 2019, AAAI.
[26] Bret Hoehn,et al. Effective short-term opponent exploitation in simplified poker , 2005, Machine Learning.
[27] D. O’Leary. A generalized conjugate gradient algorithm for solving a class of quadratic programming problems , 1977 .
[28] Stephen P. Boyd,et al. OSQP: an operator splitting solver for quadratic programs , 2017, 2018 UKACC 12th International Conference on Control (CONTROL).
[29] A. Wald. Statistical Decision Functions Which Minimize the Maximum Risk , 1945 .
[30] John C. Harsanyi,et al. A General Theory of Equilibrium Selection in Games , 1989 .
[31] Marc Lanctot,et al. Further developments of extensive-form replicator dynamics using the sequence-form representation , 2014, AAMAS.
[32] Tuomas Sandholm,et al. Correlation in Extensive-Form Games: Saddle-Point Formulation and Benchmarks , 2019, NeurIPS.
[33] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[34] David Silver,et al. Fictitious Self-Play in Extensive-Form Games , 2015, ICML.
[35] Noam Brown,et al. Superhuman AI for multiplayer poker , 2019, Science.
[36] Sriram Srinivasan,et al. OpenSpiel: A Framework for Reinforcement Learning in Games , 2019, ArXiv.
[37] Wei-Yin Loh,et al. Classification and regression trees , 2011, WIREs Data Mining Knowl. Discov.
[38] Luis E. Ortiz,et al. Maximum Entropy Correlated Equilibria , 2007, AISTATS.
[39] D. Avis,et al. Enumeration of Nash equilibria for two-player games , 2010 .
[40] J. Vial,et al. Strategically zero-sum games: The class of games whose completely mixed equilibria cannot be improved upon , 1978 .
[41] David Silver,et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning , 2017, NIPS.
[42] Nicola Gatti,et al. Simple Uncoupled No-regret Learning Dynamics for Extensive-form Correlated Equilibrium , 2020, J. ACM.
[43] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[44] J. Schreiber. Foundations Of Statistics , 2016 .
[45] Paul W. Goldberg,et al. The Complexity of the Homotopy Method, Equilibrium Selection, and Lemke-Howson Solutions , 2010, 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science.
[46] Jan Havrda,et al. Quantification method of classification processes. Concept of structural a-entropy , 1967, Kybernetika.
[47] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[48] Miroslav Dudík,et al. A Sampling-Based Approach to Computing Equilibria in Succinct Extensive-Form Games , 2009, UAI.
[49] Laurent El Ghaoui,et al. Robust Optimization , 2021, ICORES.
[50] Shu-Tao Xia,et al. Unifying attribute splitting criteria of decision trees by Tsallis entropy , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[51] Michael Bowling,et al. Hindsight and Sequential Rationality of Correlated Play , 2021, AAAI.
[52] Hans-Werner Sinn,et al. A Rehabilitation of the Principle of Insufficient Reason , 1980 .
[53] E. Jaynes. Information Theory and Statistical Mechanics , 1957 .
[54] Pierre Hansen,et al. On the geometry of Nash equilibria and correlated equilibria , 2003, Int. J. Game Theory.
[55] Jorge Nocedal,et al. A Limited Memory Algorithm for Bound Constrained Optimization , 1995, SIAM J. Sci. Comput.
[56] Eric van Damme,et al. Non-Cooperative Games , 2000 .