Hanjiang Lai | Xinrun Wang | Bo An | Zhenyu Shi | Rundong Wang | Xinwen Hou | Runsheng Yu | Buhong Liu
[1] Dmitri Botvich,et al. Multi-agent Learning for Resource Allocation in Dense Heterogeneous 5G Network , 2015, 2015 International Conference on Engineering and Telecommunication (EnT).
[3] Erfu Yang,et al. Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey , 2004 .
[4] Olivier Buffet,et al. Optimally Solving Dec-POMDPs as Continuous-State MDPs , 2013, IJCAI.
[5] Tuomas Sandholm,et al. Deep Counterfactual Regret Minimization , 2018, ICML.
[6] Hoong Chuin Lau,et al. Credit Assignment For Collective Multiagent RL With Global Rewards , 2018, NeurIPS.
[8] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[9] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[10] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2019, Autonomous Agents and Multi-Agent Systems.
[12] Michael H. Bowling,et al. Actor-Critic Policy Optimization in Partially Observable Multiagent Environments , 2018, NeurIPS.
[13] David Hsu,et al. Particle Filter Networks with Application to Visual Localization , 2018, CoRL.
[14] Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res..
[15] Weinan Zhang,et al. MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence , 2017, AAAI.
[16] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[17] Taeyoung Lee,et al. Learning to Schedule Communication in Multi-agent Reinforcement Learning , 2019, ICLR.
[18] Yung Yi,et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning , 2019, ICML.
[19] Michael H. Bowling,et al. Regret Minimization in Games with Incomplete Information , 2007, NIPS.
[21] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[22] Kurt Keutzer,et al. Regret Minimization for Partially Observable Deep Reinforcement Learning , 2017, ICML.
[23] Sam Devlin,et al. Potential-based difference rewards for multiagent reinforcement learning , 2014, AAMAS.
[25] Fei Sha,et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning , 2018, ICML.
[26] Guy Lever,et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward , 2018, AAMAS.
[27] Shimon Whiteson,et al. Deep Variational Reinforcement Learning for POMDPs , 2018, ICML.
[28] Nando de Freitas,et al. An Introduction to Sequential Monte Carlo Methods , 2001, Sequential Monte Carlo Methods in Practice.
[29] Frans A. Oliehoek,et al. Bayesian Reinforcement Learning in Factored POMDPs , 2018, AAMAS.
[30] Peng Peng,et al. Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games , 2017, arXiv:1703.10069.
[31] Feng Wu,et al. Multi-Agent Planning with Baseline Regret Minimization , 2017, IJCAI.