Improving sample efficiency in Multi-Agent Actor-Critic methods
Xiaohong Jiang | Guanghua Song | Zhenhui Ye | Yining Chen | Bowei Yang | Sheng Fan
[1] Fei Sha,et al. Actor-Attention-Critic for Multi-Agent Reinforcement Learning , 2018, ICML.
[2] Masashi Sugiyama,et al. Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics , 2019, ArXiv.
[3] Pieter Abbeel,et al. Reinforcement Learning with Augmented Data , 2020, NeurIPS.
[4] Taehoon Kim,et al. Quantifying Generalization in Reinforcement Learning , 2018, ICML.
[5] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[6] Pieter Abbeel,et al. Emergence of Grounded Compositional Language in Multi-Agent Populations , 2017, AAAI.
[7] Joan Bruna,et al. Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.
[8] Mykel J. Kochenderfer,et al. Cooperative Multi-agent Control Using Deep Reinforcement Learning , 2017, AAMAS Workshops.
[9] Vikram Manikonda,et al. A multi-agent approach to cooperative traffic management and route guidance , 2005.
[10] Yun Yang,et al. A Multi-Agent Framework for Packet Routing in Wireless Sensor Networks , 2015, Sensors.
[11] Taisuke Kobayashi,et al. Bottom-up multi-agent reinforcement learning by reward shaping for cooperative-competitive tasks , 2021, Applied Intelligence.
[12] S. Shankar Sastry,et al. Probabilistic pursuit-evasion games: theory, implementation, and experimental evaluation , 2002, IEEE Trans. Robotics Autom.
[13] Chi Harold Liu,et al. Distributed Energy-Efficient Multi-UAV Navigation for Long-Term Communication Coverage by Deep Reinforcement Learning , 2020, IEEE Transactions on Mobile Computing.
[14] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[15] Yujing Hu,et al. Multi-Agent Game Abstraction via Graph Attention Neural Network , 2019, AAAI.
[16] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[17] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[18] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[19] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.
[20] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[21] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[22] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[23] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[24] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[25] Sergey Levine,et al. Path integral guided policy search , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[26] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[27] Lukasz Kaiser,et al. Attention Is All You Need , 2017, NIPS.
[28] Alexander G. Schwing,et al. PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning , 2019, CoRL.
[29] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[30] R. Bellman. Dynamic programming , 1957, Science.
[31] Pieter Abbeel,et al. CURL: Contrastive Unsupervised Representations for Reinforcement Learning , 2020, ICML.
[32] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[33] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[34] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res.
[35] Yadong Liu,et al. GAMA: Graph Attention Multi-agent reinforcement learning algorithm for cooperation , 2020, Applied Intelligence.
[36] Marcin Andrychowicz,et al. Asymmetric Actor Critic for Image-Based Robot Learning , 2017, Robotics: Science and Systems.
[37] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[38] R. Bellman. Dynamic Programming , 1957, Science.
[39] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[40] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.