A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
Jonathan P. How | Gerald Tesauro | Miao Liu | Chuangchuang Sun | Golnaz Habibi | Sebastian Lopez-Cot | Matthew Riemer | Dong-Ki Kim | Marwa Abdulhai
[1] Yoshua Bengio,et al. On the Optimization of a Synaptic Learning Rule , 2007 .
[2] Bart De Schutter,et al. Multi-agent Reinforcement Learning: An Overview , 2010 .
[3] Giacomo Spigler. Meta-learnt priors slow down catastrophic forgetting in neural networks , 2019, ArXiv.
[4] M. Stanković. Multi-agent reinforcement learning , 2016 .
[5] Martha White,et al. Meta-Learning Representations for Continual Learning , 2019, NeurIPS.
[6] Jun Wang,et al. Multi-Agent Reinforcement Learning , 2020, Deep Reinforcement Learning.
[7] Gunshi Gupta,et al. La-MAML: Look-ahead Meta Learning for Continual Learning , 2020, NeurIPS.
[8] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[9] Ricardo Vilalta,et al. A Perspective View and Survey of Meta-Learning , 2002, Artificial Intelligence Review.
[10] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[11] S. Levine,et al. Gradient Surgery for Multi-Task Learning , 2020, NeurIPS.
[12] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[13] Shimon Whiteson,et al. Stable Opponent Shaping in Differentiable Games , 2018, ICLR.
[14] Pieter Abbeel,et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments , 2017, ICLR.
[15] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[16] Nando de Freitas,et al. Sample Efficient Actor-Critic with Experience Replay , 2016, ICLR.
[17] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[18] Shimon Whiteson,et al. Learning with Opponent-Learning Awareness , 2017, AAMAS.
[19] Shimon Whiteson,et al. DiCE: The Infinitely Differentiable Monte-Carlo Estimator , 2018, ICML.
[20] J. Schulman,et al. Reptile: a Scalable Metalearning Algorithm , 2018 .
[21] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[22] Pieter Abbeel,et al. A Simple Neural Attentive Meta-Learner , 2017, ICLR.
[23] Joel Lehman,et al. Learning to Continually Learn , 2020, ECAI.
[24] Drew Wicke,et al. Multiagent Soft Q-Learning , 2018, AAAI Spring Symposia.
[25] Jordan L. Boyd-Graber,et al. Opponent Modeling in Deep Reinforcement Learning , 2016, ICML.
[26] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[27] Rob Fergus,et al. Modeling Others using Oneself in Multi-Agent Reinforcement Learning , 2018, ICML.
[28] Siddhartha S. Srinivasa,et al. The Assistive Multi-Armed Bandit , 2019, 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI).
[29] Pieter Abbeel,et al. Meta-Learning with Temporal Convolutions , 2017, ArXiv.
[30] Maruan Al-Shedivat,et al. Learning Policy Representations in Multiagent Systems , 2018, ICML.
[31] Ming Zhou,et al. Mean Field Multi-Agent Reinforcement Learning , 2018, ICML.
[32] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[33] Gerald Tesauro,et al. Learning to Learn without Forgetting By Maximizing Transfer and Minimizing Interference , 2018, ICLR.
[34] Pablo Hernandez-Leal,et al. A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity , 2017, ArXiv.
[35] Amos Storkey,et al. Meta-Learning in Neural Networks: A Survey , 2020, IEEE transactions on pattern analysis and machine intelligence.
[36] Filippos Christianos,et al. Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning , 2019, ArXiv.
[37] Min Lin,et al. Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning , 2020, ArXiv.
[38] Victor R. Lesser,et al. Multi-Agent Learning with Policy Prediction , 2010, AAAI.
[39] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[40] David Vázquez,et al. Online Fast Adaptation and Knowledge Accumulation (OSAKA): a New Approach to Continual Learning , 2020, NeurIPS.
[41] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[42] L. Shapley,et al. Stochastic Games , 1953, Proceedings of the National Academy of Sciences.
[43] Philip H. S. Torr,et al. Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control , 2020, ArXiv.