Off-Policy Multi-Agent Decomposed Policy Gradients
Beining Han | Chongjie Zhang | Yihan Wang | Tonghan Wang | Heng Dong
[1] Yung Yi, et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning, 2019, ICML.
[2] Pablo Hernandez-Leal, et al. A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity, 2017, ArXiv.
[3] Pieter Abbeel, et al. Emergence of Grounded Compositional Language in Multi-Agent Populations, 2017, AAAI.
[4] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[5] Joel Z. Leibo, et al. Inequity aversion improves cooperation in intertemporal social dilemmas, 2018, NeurIPS.
[6] Nikos A. Vlassis, et al. Collaborative Multiagent Reinforcement Learning by Payoff Propagation, 2006, J. Mach. Learn. Res.
[7] Nando de Freitas, et al. Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning, 2018, ICML.
[8] Fei Sha, et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning, 2018, ICML.
[9] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[10] Carlos Guestrin, et al. Multiagent Planning with Factored MDPs, 2001, NIPS.
[11] Alexander Peysakhovich, et al. Multi-Agent Cooperation and the Emergence of (Natural) Language, 2016, ICLR.
[12] Kagan Tumer, et al. Optimal Payoff Functions for Members of Collectives, 2001, Adv. Complex Syst.
[13] Marc G. Bellemare, et al. Safe and Efficient Off-Policy Reinforcement Learning, 2016, NIPS.
[14] Martha White, et al. Linear Off-Policy Actor-Critic, 2012, ICML.
[15] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[16] Shimon Whiteson, et al. Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, 2020, J. Mach. Learn. Res.
[17] Tamer Basar, et al. A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning, 2019, IFAC-PapersOnLine.
[18] Shimon Whiteson, et al. RODE: Learning Roles to Decompose Multi-Agent Tasks, 2020, ICLR.
[19] Kagan Tumer, et al. Learning sequences of actions in collectives of autonomous agents, 2002, AAMAS '02.
[20] Shimon Whiteson, et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning, 2016, NIPS.
[21] Tamer Basar, et al. Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents, 2018, ICML.
[22] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[23] Sergey Levine, et al. Guided Policy Search, 2013, ICML.
[24] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[25] Joelle Pineau, et al. TarMAC: Targeted Multi-Agent Communication, 2018, ICML.
[26] Frans A. Oliehoek, et al. A Concise Introduction to Decentralized POMDPs, 2016, SpringerBriefs in Intelligent Systems.
[27] Jakub W. Pachocki, et al. Dota 2 with Large Scale Deep Reinforcement Learning, 2019, ArXiv.
[28] Yee Whye Teh, et al. Distral: Robust multitask reinforcement learning, 2017, NIPS.
[29] Yang Yu, et al. QPLEX: Duplex Dueling Multi-Agent Q-Learning, 2020, ArXiv.
[30] Victor Lesser, et al. ROMA: Multi-Agent Reinforcement Learning with Emergent Roles, 2020, ICML.
[31] Michail G. Lagoudakis, et al. Coordinated Reinforcement Learning, 2002, ICML.
[32] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[33] Jonathan P. How, et al. Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability, 2017, ICML.
[34] Leslie Pack Kaelbling, et al. Off-Policy Policy Search, 2007.
[35] Santiago Zazo, et al. Diff-DAC: Distributed Actor-Critic for Multitask Deep Reinforcement Learning, 2017, ArXiv.
[36] Rob Fergus, et al. Learning Multiagent Communication with Backpropagation, 2016, NIPS.
[37] Tamer Basar, et al. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms, 2019, Handbook of Reinforcement Learning and Control.
[38] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[39] Yi Wu, et al. Influence-Based Multi-Agent Exploration, 2020, ICLR.
[40] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[41] Shimon Whiteson, et al. MAVEN: Multi-Agent Variational Exploration, 2019, NeurIPS.
[42] Kagan Tumer, et al. Unifying temporal and structural credit assignment problems, 2004, Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004.
[43] Nando de Freitas, et al. Sample Efficient Actor-Critic with Experience Replay, 2016, ICLR.
[44] Guy Lever, et al. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward, 2018, AAMAS.
[45] Mykel J. Kochenderfer, et al. Cooperative Multi-agent Control Using Deep Reinforcement Learning, 2017, AAMAS Workshops.
[46] Shimon Whiteson, et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning, 2017, ICML.
[47] Philip H. S. Torr, et al. Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control, 2020, ArXiv.
[48] Sergey Levine, et al. Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning, 2018, ArXiv.
[49] Chongjie Zhang, et al. Towards Understanding Linear Value Decomposition in Cooperative Multi-Agent Q-Learning, 2020, ArXiv.
[50] Igor Mordatch, et al. Emergent Tool Use From Multi-Agent Autocurricula, 2019, ICLR.
[51] Doina Precup, et al. Eligibility Traces for Off-Policy Policy Evaluation, 2000, ICML.
[52] Kun Yuan, et al. Multiagent Fully Decentralized Value Function Learning With Linear Convergence Rates, 2018, IEEE Transactions on Automatic Control.
[53] Shimon Whiteson, et al. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, 2018, ICML.
[54] Shimon Whiteson, et al. The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning, 2019, AAMAS.
[55] Yi Wu, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, 2017, NIPS.
[56] Yan Zhang, et al. Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus, 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).
[57] Shimon Whiteson, et al. The StarCraft Multi-Agent Challenge, 2019, AAMAS.
[58] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[59] Pieter Abbeel, et al. On a Connection between Importance Sampling and the Likelihood Ratio Policy Gradient, 2010, NIPS.
[60] Jun Wang, et al. Multi-Agent Reinforcement Learning, 2020, Deep Reinforcement Learning.
[61] Shimon Whiteson, et al. Counterfactual Multi-Agent Policy Gradients, 2017, AAAI.
[62] Victor R. Lesser, et al. Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs, 2011, AAAI.