Alexander Liniger | Manish Prajapat | Kamyar Azizzadenesheli | Yisong Yue | Anima Anandkumar
[1] Sridhar Mahadevan, et al. Global Convergence to the Equilibrium of GANs using Variational Inequalities, 2018, arXiv.
[2] Fei Sha, et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning, 2018, ICML.
[3] Guillaume J. Laurent, et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems, 2012, The Knowledge Engineering Review.
[4] David Silver, et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games, 2016, arXiv.
[5] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[6] Florian Schäfer, et al. Competitive Gradient Descent, 2019, NeurIPS.
[7] Shimon Whiteson, et al. Learning with Opponent-Learning Awareness, 2017, AAMAS.
[8] Tamer Basar, et al. Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games, 2019, NeurIPS.
[9] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[10] Kamyar Azizzadenesheli, et al. Policy Gradient in Partially Observable Environments: Approximation and Convergence, 2018.
[11] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[12] Gábor Lugosi, et al. Prediction, Learning, and Games, 2006.
[13] Michael L. Littman, et al. Value-function reinforcement learning in Markov games, 2001, Cognitive Systems Research.
[14] Manuela M. Veloso, et al. Multiagent learning using a variable learning rate, 2002, Artif. Intell.
[15] Michael I. Jordan, et al. Policy-Gradient Algorithms Have No Guarantees of Convergence in Linear Quadratic Games, 2019, AAMAS.
[16] Dorian Kodelja, et al. Multiagent cooperation and competition with deep reinforcement learning, 2015, PLoS ONE.
[17] Jacob Andreas, et al. Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?, 2017, ICML.
[18] Matthew E. Taylor, et al. A survey and critique of multiagent deep reinforcement learning, 2019, Autonomous Agents and Multi-Agent Systems.
[19] Joel Z. Leibo, et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas, 2017, AAMAS.
[20] Haitham Bou-Ammar, et al. Balancing Two-Player Stochastic Games with Soft Q-Learning, 2018, IJCAI.
[21] J. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, 1994.
[22] Chuan-Sheng Foo, et al. Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile, 2018, ICLR.
[23] Bart De Schutter, et al. Multi-agent Reinforcement Learning: An Overview, 2010.
[24] Michael P. Wellman, et al. Nash Q-Learning for General-Sum Stochastic Games, 2003, J. Mach. Learn. Res.
[25] George B. Dantzig, et al. Linear programming and extensions, 1965.
[26] H. Robbins. A Stochastic Approximation Method, 1951.
[27] Benoît Frénay, et al. QL2, a simple reinforcement learning scheme for two-player zero-sum Markov games, 2009, ESANN.
[28] Christos H. Papadimitriou, et al. Cycles in adversarial regularized learning, 2017, SODA.
[29] Yoram Singer, et al. Convex Repeated Games and Fenchel Duality, 2006, NIPS.
[30] Tim Roughgarden, et al. Algorithmic Game Theory, 2007.
[31] Rob Fergus, et al. Modeling Others using Oneself in Multi-Agent Reinforcement Learning, 2018, ICML.
[32] Michael H. Bowling, et al. Actor-Critic Policy Optimization in Partially Observable Multiagent Environments, 2018, NeurIPS.
[33] Ioannis Mitliagkas, et al. Negative Momentum for Improved Game Dynamics, 2018, AISTATS.
[34] Neil Burch, et al. Heads-up limit hold’em poker is solved, 2015, Science.
[35] A. Wald. Sequential Tests of Statistical Hypotheses, 1945.
[36] Anima Anandkumar, et al. Implicit competitive regularization in GANs, 2020, ICML.
[37] J. Filar, et al. Competitive Markov Decision Processes, 1996.
[38] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[39] Mi-Ching Tsai, et al. Robust and Optimal Control, 2014.
[40] Michael H. Bowling, et al. Regret Minimization in Games with Incomplete Information, 2007, NIPS.
[41] Michael L. Littman, et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning, 1994, ICML.
[42] John Lygeros, et al. A Noncooperative Game Approach to Autonomous Racing, 2017, IEEE Transactions on Control Systems Technology.
[43] J. M. Keynes. The General Theory of Employment, Interest and Money, 1936.
[44] Keith B. Hall, et al. Correlated Q-Learning, 2003, ICML.
[45] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[46] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[47] J. Doyle, et al. Robust and optimal control, 1995, Proceedings of 35th IEEE Conference on Decision and Control.
[48] R. C. Coulter. Implementation of the Pure Pursuit Path Tracking Algorithm, 1992.
[49] Todd W. Neller, et al. An Introduction to Counterfactual Regret Minimization, 2013.
[50] Mohammad Taghi Hajiaghayi, et al. Regret minimization and the price of total anarchy, 2008, STOC.
[51] David C. Parkes, et al. The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies, 2020, arXiv.
[52] J. Keynes. The General Theory of Employment, 1937.
[53] Javier Peña, et al. Gradient-Based Algorithms for Finding Nash Equilibria in Extensive Form Games, 2007, WINE.
[54] Constantinos Daskalakis, et al. Training GANs with Optimism, 2017, ICLR.
[55] David Silver, et al. Fictitious Self-Play in Extensive-Form Games, 2015, ICML.
[56] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[57] Sebastian Nowozin, et al. The Numerics of GANs, 2017, NIPS.
[58] Thore Graepel, et al. The Mechanics of n-Player Differentiable Games, 2018, ICML.
[59] Jordan L. Boyd-Graber, et al. Opponent Modeling in Deep Reinforcement Learning, 2016, ICML.
[60] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[61] Michael L. Littman, et al. Friend-or-Foe Q-learning in General-Sum Games, 2001, ICML.
[62] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[63] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.
[64] Éva Tardos, et al. No-Regret Learning in Bayesian Games, 2015, NIPS.
[65] Nesa L'abbe Wu, et al. Linear programming and extensions, 1981.
[66] Hans B. Pacejka, et al. Tyre Modelling for Use in Vehicle Dynamics Studies, 1987.
[67] Kevin Waugh, et al. DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker, 2017, arXiv.
[68] John Lygeros, et al. Optimization-Based Hierarchical Motion Planning for Autonomous Racing, 2020, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[69] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[70] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[71] Manfred Morari, et al. Optimization-based autonomous racing of 1:43 scale RC cars, 2015, arXiv.
[72] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[73] Olivier Pietquin, et al. Actor-Critic Fictitious Play in Simultaneous Move Multistage Games, 2018, AISTATS.
[74] Stefan Winkler, et al. The Unusual Effectiveness of Averaging in GAN Training, 2018, ICLR.
[75] Sergey Levine, et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation, 2015, ICLR.
[76] D. Koller, et al. Efficient Computation of Equilibria for Extensive Two-Person Games, 1996.