[1] Guy Lever, et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward, 2018, AAMAS.
[2] Shie Mannor, et al. Policy Gradients with Variance Related Risk Criteria, 2012, ICML.
[3] Max Jaderberg, et al. Open-ended Learning in Symmetric Zero-sum Games, 2019, ICML.
[4] Shlomo Zilberstein, et al. Dynamic Programming for Partially Observable Stochastic Games, 2004, AAAI.
[5] Michael P. Wellman, et al. Structure Learning for Approximate Solution of Many-Player Games, 2020, AAAI.
[6] Andreas Krause, et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting, 2009, IEEE Transactions on Information Theory.
[7] Peter Duersch, et al. Pure strategy equilibria in symmetric two-player zero-sum games, 2011, Int. J. Game Theory.
[8] Shimon Whiteson, et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning, 2016, NIPS.
[9] Sergey Levine, et al. Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, 2016, ICML.
[10] Michael I. Jordan, et al. RLlib: Abstractions for Distributed Reinforcement Learning, 2017, ICML.
[11] K. Madhava Krishna, et al. Parameter Sharing Reinforcement Learning Architecture for Multi Agent Driving Behaviors, 2018, ArXiv.
[12] David C. Parkes, et al. The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies, 2020, ArXiv.
[13] Shimon Whiteson, et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning, 2017, ICML.
[14] Marcin Andrychowicz, et al. Learning to learn by gradient descent by gradient descent, 2016, NIPS.
[15] Michael H. Bowling, et al. Actor-Critic Policy Optimization in Partially Observable Multiagent Environments, 2018, NeurIPS.
[16] Rob Fergus, et al. Learning Multiagent Communication with Backpropagation, 2016, NIPS.
[17] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[18] Peter Duersch, et al. Pure strategy equilibria in symmetric two-player zero-sum games, 2012, Int. J. Game Theory.
[19] A. Hefti, et al. Equilibria in symmetric games: theory and applications, 2017.
[20] Amir Sani, et al. Agent-Based Model Calibration Using Machine Learning Surrogates, 2017, arXiv:1703.10639.
[21] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[22] Priscilla Avegliano, et al. Using Surrogate Models to Calibrate Agent-based Model Parameters Under Data Scarcity, 2019, AAMAS.
[23] Shimon Whiteson, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, 2018, ICML.
[24] Shie Mannor, et al. Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning, 2017, COLT.
[25] Karl Tuyls, et al. Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent, 2019, IJCAI.
[26] J. Tsitsiklis, et al. Convergence rate of linear two-time-scale stochastic approximation, 2004, arXiv:math/0405287.
[27] Matthew E. Taylor, et al. A survey and critique of multiagent deep reinforcement learning, 2019, Autonomous Agents and Multi-Agent Systems.
[28] Jitendra Malik, et al. Learning to Optimize, 2016, ICLR.
[29] Xi Chen, et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning, 2017, ArXiv.
[30] Tianlong Chen, et al. Learning to Optimize in Swarms, 2019, NeurIPS.
[31] Martijn C. Schut, et al. Reinforcement Learning for Online Control of Evolutionary Algorithms, 2006, ESOA.