Sam Devlin | Rahul Savani | Frans A. Oliehoek | Jacopo Castellini
[1] Guillaume J. Laurent,et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems, 2012, The Knowledge Engineering Review.
[2] Léon Bottou. Online Learning and Stochastic Approximations, 1998.
[3] Frans A. Oliehoek,et al. Coordinated Deep Reinforcement Learners for Traffic Light Control, 2016.
[4] Kagan Tumer,et al. Analyzing and visualizing multiagent rewards in dynamic and stochastic domains , 2008, Autonomous Agents and Multi-Agent Systems.
[5] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[6] Yan Zhang,et al. Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus , 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).
[7] Michael H. Bowling,et al. Actor-Critic Policy Optimization in Partially Observable Multiagent Environments , 2018, NeurIPS.
[8] Kagan Tumer,et al. Distributed agent-based air traffic flow management , 2007, AAMAS '07.
[9] Kagan Tumer,et al. Optimal Payoff Functions for Members of Collectives , 2001, Adv. Complex Syst.
[10] Sam Devlin,et al. Potential-based difference rewards for multiagent reinforcement learning , 2014, AAMAS.
[11] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res..
[12] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[13] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[14] Bikramjit Banerjee,et al. Multi-agent reinforcement learning as a rehearsal for decentralized planning , 2016, Neurocomputing.
[15] Wenwu Yu,et al. An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination , 2012, IEEE Transactions on Industrial Informatics.
[16] Filippos Christianos,et al. Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning , 2019, ArXiv.
[17] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[18] Shimon Whiteson,et al. The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning , 2019, AAMAS.
[19] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[20] Yun Yang,et al. A Multi-Agent Framework for Packet Routing in Wireless Sensor Networks , 2015, Sensors.
[21] Frans A. Oliehoek,et al. A Concise Introduction to Decentralized POMDPs , 2016, SpringerBriefs in Intelligent Systems.
[22] Michail G. Lagoudakis,et al. Coordinated Reinforcement Learning , 2002, ICML.
[23] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[24] Peter Henderson,et al. Reward Estimation for Variance Reduction in Deep Reinforcement Learning , 2018, CoRL.
[25] Kagan Tumer,et al. An Introduction to Collective Intelligence , 1999, ArXiv.
[26] Hoong Chuin Lau,et al. Credit Assignment For Collective Multiagent RL With Global Rewards , 2018, NeurIPS.
[27] Kee-Eung Kim,et al. Learning to Cooperate via Policy Search , 2000, UAI.
[28] Leslie Pack Kaelbling,et al. All learning is Local: Multi-agent Learning in Global Reward Games , 2003, NIPS.
[29] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[30] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[31] Craig Boutilier,et al. Planning, Learning and Coordination in Multiagent Decision Processes , 1996, TARK.
[32] Kagan Tumer,et al. Approximating Difference Evaluations with Local Information , 2015, AAMAS.
[33] Kagan Tumer,et al. Modeling difference rewards for multiagent learning , 2012, AAMAS.
[34] Kagan Tumer,et al. Multi-objective Multiagent Credit Assignment Through Difference Rewards in Reinforcement Learning , 2014, SEAL.
[35] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning, 2011.
[36] Mykel J. Kochenderfer,et al. Cooperative Multi-agent Control Using Deep Reinforcement Learning , 2017, AAMAS Workshops.
[37] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[38] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2019, Autonomous Agents and Multi-Agent Systems.
[39] Tonghan Wang,et al. Off-Policy Multi-Agent Decomposed Policy Gradients , 2020, ArXiv.
[40] Kagan Tumer,et al. Approximating difference evaluations with local knowledge , 2014, AAMAS.
[41] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[42] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.