Richard Tanburn | Misha Denil | Nando de Freitas | Matthew D. Hoffman | Ziyu Wang | Çaglar Gülçehre | Gabriel Barth-Maron | Duncan Williams | Hubert Soyer | Tom Le Paine | Bobak Shahriari | Neil C. Rabinowitz | Steven Kapturowski | Worlds Team
[1] Dean Pomerleau, et al. ALVINN, an autonomous land vehicle in a neural network, 2015.
[2] Rémi Munos, et al. Observe and Look Further: Achieving Consistent Performance on Atari, 2018, ArXiv.
[3] Rouhollah Rahmatizadeh, et al. Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-to-End Learning from Demonstration, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[4] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[5] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[6] Sergey Levine, et al. Divide-and-Conquer Reinforcement Learning, 2017, ICLR.
[7] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[8] Marc G. Bellemare, et al. Investigating Contingency Awareness Using Atari 2600 Games, 2012, AAAI.
[9] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[10] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[11] Nando de Freitas, et al. Playing hard exploration games by watching YouTube, 2018, NeurIPS.
[12] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[13] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[14] Marlos C. Machado, et al. Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment, 2019, ArXiv.
[15] Kenneth O. Stanley, et al. Go-Explore: a New Approach for Hard-Exploration Problems, 2019, ArXiv.
[16] David Budden, et al. Distributed Prioritized Experience Replay, 2018, ICLR.
[17] Wes McKinney, et al. Data Structures for Statistical Computing in Python, 2010, SciPy.
[18] Martin A. Riedmiller, et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards, 2017, ArXiv.
[19] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[20] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).
[21] Sergio Gomez Colmenarejo, et al. One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL, 2018, ArXiv.
[22] Yoshua Bengio, et al. Reinforced Imitation in Heterogeneous Action Space, 2019, ArXiv.
[23] Julian Togelius, et al. Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning, 2019, IJCAI.
[24] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[25] Travis E. Oliphant, et al. Guide to NumPy, 2015.
[26] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[27] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[28] Albin Cassirer, et al. Randomized Prior Functions for Deep Reinforcement Learning, 2018, NeurIPS.
[29] Rémi Munos, et al. Recurrent Experience Replay in Distributed Reinforcement Learning, 2018, ICLR.
[30] Pieter Abbeel, et al. Benchmarking Model-Based Reinforcement Learning, 2019, ArXiv.
[31] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[32] Marlos C. Machado, et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents, 2017, J. Artif. Intell. Res.
[33] Tom Schaul, et al. Rainbow: Combining Improvements in Deep Reinforcement Learning, 2017, AAAI.
[34] Tim Salimans, et al. Learning Montezuma's Revenge from a Single Demonstration, 2018, ArXiv.
[35] Joelle Pineau, et al. Learning from Limited Demonstrations, 2013, NIPS.
[36] Sergey Levine, et al. DeepMimic, 2018, ACM Trans. Graph.
[37] Doina Precup, et al. Off-Policy Deep Reinforcement Learning without Exploration, 2018, ICML.
[38] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.
[39] Yuval Tassa, et al. Learning human behaviors from motion capture by adversarial imitation, 2017, ArXiv.
[40] Jürgen Schmidhuber, et al. Curious model-building control systems, 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[41] John D. Hunter, et al. Matplotlib: A 2D Graphics Environment, 2007, Computing in Science & Engineering.
[42] Jason Weston, et al. Curriculum learning, 2009, ICML '09.