暂无分享,去创建一个
Shimon Whiteson | Gregory Farquhar | Bei Peng | Tabish Rashid | S. Whiteson | Gregory Farquhar | Bei Peng | Tabish Rashid | Shimon Whiteson
[1] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[2] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[3] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[4] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[5] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[6] R. Paul Wiegand,et al. Biasing Coevolutionary Search for Optimal Multiagent Behaviors , 2006, IEEE Transactions on Evolutionary Computation.
[7] Guillaume J. Laurent,et al. Hysteretic q-learning :an algorithm for decentralized reinforcement learning in cooperative multi-agent teams , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[8] Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res..
[9] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[10] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[11] Bikramjit Banerjee,et al. Multi-agent reinforcement learning as a rehearsal for decentralized planning , 2016, Neurocomputing.
[12] Frans A. Oliehoek,et al. A Concise Introduction to Decentralized POMDPs , 2016, SpringerBriefs in Intelligent Systems.
[13] Marc G. Bellemare,et al. Increasing the Action Gap: New Operators for Reinforcement Learning , 2015, AAAI.
[14] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[15] Shimon Whiteson,et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning , 2017, ICML.
[16] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[17] Dorian Kodelja,et al. Multiagent cooperation and competition with deep reinforcement learning , 2015, PloS one.
[18] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[19] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[20] Guy Lever,et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward , 2018, AAMAS.
[21] Drew Wicke,et al. Multiagent Soft Q-Learning , 2018, AAAI Spring Symposia.
[22] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[23] Xiaoyang Tan,et al. SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multiagent Reinforcement Learning , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[24] Lei Han,et al. LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning , 2019, NeurIPS.
[25] Yung Yi,et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning , 2019, ICML.
[26] Shimon Whiteson,et al. The StarCraft Multi-Agent Challenge , 2019, AAMAS.
[27] Fei Sha,et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning , 2018, ICML.
[28] Shimon Whiteson,et al. Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning , 2019, ArXiv.
[29] Afshin Oroojlooyjadid,et al. A review of cooperative multi-agent deep reinforcement learning , 2019, Applied Intelligence.
[30] S. Whiteson,et al. Deep Coordination Graphs , 2019, ICML.
[31] Jianye Hao,et al. Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning , 2020, ArXiv.
[32] Jinwoo Shin,et al. QOPT: Optimistic Value Function Decentralization for Cooperative Multi-Agent Reinforcement Learning , 2020, ArXiv.
[33] Chongjie Zhang,et al. QPLEX: Duplex Dueling Multi-Agent Q-Learning , 2020, ICLR.