Regularized Gradient Descent Ascent for Two-Player Zero-Sum Markov Games