Zidong Du | Yunji Chen | Xishan Zhang | Ling Li | Chen Zhang | Rui Zhang | Qi Guo | Ruizhi Chen | Xiaoyu Wu | Yansong Pan | Kaizhao Yuan | TianYun Ma | JiYuan Liang | Kai Wang | Shaohui Peng
[1] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[2] Pierre-Yves Oudeyer,et al. Intrinsic Motivation Systems for Autonomous Mental Development , 2007, IEEE Transactions on Evolutionary Computation.
[3] Thomas Lukasiewicz,et al. Diversity-Driven Extensible Hierarchical Reinforcement Learning , 2018, AAAI.
[4] Sergey Levine,et al. Meta-Learning with Implicit Gradients , 2019, NeurIPS.
[5] Zihan Zhou,et al. CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario , 2019, WWW.
[6] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[7] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[8] Massimiliano Pontil,et al. The Benefit of Multitask Representation Learning , 2015, J. Mach. Learn. Res.
[9] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[10] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[11] Amos J. Storkey,et al. Exploration by Random Network Distillation , 2018, ICLR.
[12] Kate Saenko,et al. Learning Multi-Level Hierarchies with Hindsight , 2017, ICLR.
[13] Shane Legg,et al. Noisy Networks for Exploration , 2017, ICLR.
[14] Richard S. Sutton,et al. Dyna, an integrated architecture for learning, planning, and reacting , 1990, SGAR.
[15] Wojciech Czarnecki,et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[16] Pieter Abbeel,et al. Meta Learning Shared Hierarchies , 2017, ICLR.
[17] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[18] Joshua Achiam,et al. On First-Order Meta-Learning Algorithms , 2018, ArXiv.
[19] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[20] Pieter Abbeel,et al. A Simple Neural Attentive Meta-Learner , 2017, ICLR.
[21] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[22] Siyuan Li,et al. Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards , 2019, NeurIPS.
[23] Sergey Levine,et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning , 2019, CoRL.
[24] Yee Whye Teh,et al. Distral: Robust multitask reinforcement learning , 2017, NIPS.
[25] Qusay H. Mahmoud,et al. A Survey of Multi-Task Deep Reinforcement Learning , 2020, Electronics.
[26] Sergey Levine,et al. Continuous Deep Q-Learning with Model-based Acceleration , 2016, ICML.
[27] Deepak Pathak,et al. Self-Supervised Exploration via Disagreement , 2019, ICML.
[28] Razvan Pascanu,et al. Policy Distillation , 2015, ICLR.
[29] Honglak Lee,et al. Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion , 2018, NeurIPS.
[30] Sergey Levine,et al. Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning , 2021, ICML.
[31] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[32] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[33] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[34] Yoshua Bengio,et al. Bayesian Model-Agnostic Meta-Learning , 2018, NeurIPS.
[35] Alexei A. Efros,et al. Curiosity-Driven Exploration by Self-Supervised Prediction , 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[36] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[37] Daniel Guo,et al. Agent57: Outperforming the Atari Human Benchmark , 2020, ICML.
[38] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[39] Kenneth O. Stanley,et al. First return then explore , 2021, Nature.
[40] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[41] Sergey Levine,et al. Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning , 2018, ArXiv.
[42] Daniel Guo,et al. Never Give Up: Learning Directed Exploration Strategies , 2020, ICLR.
[43] Pieter Abbeel,et al. Stochastic Neural Networks for Hierarchical Reinforcement Learning , 2016, ICLR.
[44] R. McFarlane. A Survey of Exploration Strategies in Reinforcement Learning , 2003.
[45] Marc G. Bellemare,et al. Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.
[46] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[47] Marc G. Bellemare,et al. A Distributional Perspective on Reinforcement Learning , 2017, ICML.
[48] Chrisantha Fernando,et al. PathNet: Evolution Channels Gradient Descent in Super Neural Networks , 2017, ArXiv.
[49] Gabriel Kalweit,et al. Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning , 2017, CoRL.
[50] Rob Fergus,et al. Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning , 2018, ArXiv.
[51] Marc G. Bellemare,et al. Count-Based Exploration with Neural Density Models , 2017, ICML.
[52] Razvan Pascanu,et al. Imagination-Augmented Agents for Deep Reinforcement Learning , 2017, NIPS.
[53] Jakub W. Pachocki,et al. Dota 2 with Large Scale Deep Reinforcement Learning , 2019, ArXiv.
[54] Wenlong Fu,et al. Model-based reinforcement learning: A survey , 2018.
[55] Lianlong Wu,et al. Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence , 2019, AAAI.
[56] Max Jaderberg,et al. Open-Ended Learning Leads to Generally Capable Agents , 2021, ArXiv.
[57] Joelle Pineau,et al. Benchmarking Batch Deep Reinforcement Learning Algorithms , 2019, ArXiv.
[58] Pieter Abbeel,et al. Model-Ensemble Trust-Region Policy Optimization , 2018, ICLR.
[59] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.