Sergey Levine | Glen Berseth | John D. Co-Reyes | Suvansh Sanjeev | Abhishek Gupta
[1] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[2] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[3] Sham M. Kakade, et al. Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes, 2019, COLT.
[4] Yuval Tassa, et al. Emergence of Locomotion Behaviours in Rich Environments, 2017, ArXiv.
[5] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[6] Shane Legg, et al. DeepMind Lab, 2016, ArXiv.
[7] R. French. Catastrophic forgetting in connectionist networks, 1999, Trends in Cognitive Sciences.
[8] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[9] Mark B. Ring. CHILD: A First Step Towards Continual Learning, 1997, Machine Learning.
[10] Mengdi Wang, et al. Primal-Dual π Learning: Sample Complexity and Sublinear Run Time for Ergodic Markov Decision Problems, 2017, ArXiv.
[11] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[12] Richard Socher, et al. Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards, 2019, NeurIPS.
[13] Yee Whye Teh, et al. Progress & Compress: A scalable framework for continual learning, 2018, ICML.
[14] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, CVPR Workshops (CVPRW).
[15] Maja J. Mataric, et al. Reward Functions for Accelerated Learning, 1994, ICML.
[16] Pieter Abbeel, et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments, 2017, ICLR.
[17] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[18] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration, 2016, NIPS.
[19] Sergey Levine, et al. The Ingredients of Real-World Robotic Reinforcement Learning, 2020, ICLR.
[20] Sergey Levine, et al. Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning, 2017, ICLR.
[21] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[22] Peter L. Bartlett, et al. Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning, 2000, J. Comput. Syst. Sci.
[23] Murray Shanahan, et al. Continual Reinforcement Learning with Complex Synapses, 2018, ICML.
[24] Sam Devlin, et al. Dynamic potential-based reward shaping, 2012, AAMAS.
[25] Jason Weston, et al. Curriculum learning, 2009, ICML.
[26] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[27] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[28] C. Karen Liu, et al. Learning symmetric and low-energy locomotion, 2018, ACM Trans. Graph.
[29] Yishay Mansour, et al. Reinforcement Learning in POMDPs Without Resets, 2005, IJCAI.
[30] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[31] Michael McCloskey, et al. Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem, 1989, Psychology of Learning and Motivation.
[32] James Kirkpatrick, et al. Overcoming catastrophic forgetting in neural networks, 2016, ArXiv.
[33] Andrea Lockerd Thomaz, et al. Policy Shaping: Integrating Human Feedback with Reinforcement Learning, 2013, NIPS.
[34] Rui Wang, et al. Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions, 2019, ArXiv.
[35] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[36] Kenneth O. Stanley, et al. Go-Explore: a New Approach for Hard-Exploration Problems, 2019, ArXiv.
[37] Marwan Mattar, et al. Unity: A General Platform for Intelligent Agents, 2018, ArXiv.
[38] Ilya Kostrikov, et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play, 2017, ICLR.
[39] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[40] Martin A. Riedmiller, et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch, 2018, ICML.
[41] Jonathan P. How, et al. Learning to Teach in Cooperative Multiagent Reinforcement Learning, 2018, AAAI.
[42] Pieter Abbeel, et al. Safe Exploration in Markov Decision Processes, 2012, ICML.
[43] Sonia Chernova, et al. Reinforcement Learning from Demonstration through Shaping, 2015, IJCAI.
[44] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[45] Sergey Levine, et al. Learning compound multi-step controllers under unknown dynamics, 2015, IROS.
[46] John Schulman, et al. Concrete Problems in AI Safety, 2016, ArXiv.
[47] Jean-Baptiste Mouret, et al. Reset-free Trial-and-Error Learning for Robot Damage Recovery, 2016, Robotics Auton. Syst.
[48] Pieter Abbeel, et al. Reverse Curriculum Generation for Reinforcement Learning, 2017, CoRL.
[49] Jiwon Kim, et al. Continual Learning with Deep Generative Replay, 2017, NIPS.