FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning
Guangming Xie | Zongqing Lu | Yueheng Li | Tianhao Zhang | Chen Wang
[1] Nikos A. Vlassis,et al. Optimal and Approximate Q-value Functions for Decentralized POMDPs , 2008, J. Artif. Intell. Res..
[2] Joelle Pineau,et al. TarMAC: Targeted Multi-Agent Communication , 2018, ICML.
[3] Drew Wicke,et al. Multiagent Soft Q-Learning , 2018, AAAI Spring Symposia.
[4] J. Andrew Bagnell,et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy , 2010 .
[5] Zongqing Lu,et al. Learning Fairness in Multi-Agent Systems , 2019, NeurIPS.
[6] Shimon Whiteson,et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2018, ICML.
[7] Peter A. Beling,et al. Value-Decomposition Multi-Agent Actor-Critics , 2021, AAAI.
[8] Shimon Whiteson,et al. The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning , 2019, AAMAS.
[9] Joel Z. Leibo,et al. Inequity aversion improves cooperation in intertemporal social dilemmas , 2018, NeurIPS.
[10] Yuan Qi,et al. Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning , 2019, NeurIPS.
[11] Yang Yu,et al. QPLEX: Duplex Dueling Multi-Agent Q-Learning , 2020, ArXiv.
[12] Sean Luke,et al. Lenient learning in independent-learner stochastic cooperative games , 2016 .
[13] Guy Lever,et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward , 2018, AAMAS.
[14] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[15] Tonghan Wang,et al. Off-Policy Multi-Agent Decomposed Policy Gradients , 2020, ArXiv.
[16] Shimon Whiteson,et al. MAVEN: Multi-Agent Variational Exploration , 2019, NeurIPS.
[17] Zongqing Lu,et al. Learning Individually Inferred Communication for Multi-Agent Cooperation , 2020, NeurIPS.
[18] Tom Eccles,et al. Learning Reciprocity in Complex Sequential Social Dilemmas , 2019, ArXiv.
[19] Shimon Whiteson,et al. Weighted QMIX: Expanding Monotonic Value Function Factorisation , 2020, NeurIPS.
[20] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[21] Shimon Whiteson,et al. Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning , 2020, J. Mach. Learn. Res..
[22] R. Paul Wiegand,et al. Biasing Coevolutionary Search for Optimal Multiagent Behaviors , 2006, IEEE Transactions on Evolutionary Computation.
[23] Tamer Basar,et al. Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents , 2018, ICML.
[24] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[25] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[26] Jianye Hao,et al. Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning , 2020, ArXiv.
[27] Nando de Freitas,et al. Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning , 2018, ICML.
[28] Fei Sha,et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning , 2018, ICML.
[29] Philip H. S. Torr,et al. Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control , 2020, ArXiv.
[30] Tiejun Huang,et al. Graph Convolutional Reinforcement Learning , 2020, ICLR.
[31] Zhi Zhang,et al. Integrating Independent and Centralized Multi-agent Reinforcement Learning for Traffic Signal Network Optimization , 2019, AAMAS.
[32] Dong Chen,et al. SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving , 2020, ArXiv.
[33] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[34] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[35] Shimon Whiteson,et al. The StarCraft Multi-Agent Challenge , 2019, AAMAS.
[36] Huizhu Jia,et al. Hierarchically and Cooperatively Learning Traffic Signal Control , 2021, AAAI.
[37] Bikramjit Banerjee,et al. Multi-agent reinforcement learning as a rehearsal for decentralized planning , 2016, Neurocomputing.