MAKING EFFICIENT USE OF DEMONSTRATIONS TO SOLVE HARD EXPLORATION PROBLEMS
Matthew W. Hoffman, Neil C. Rabinowitz, T. Paine, Caglar Gulcehre, Gabriel Barth-Maron, N. D. Freitas, Hubert Soyer, Misha Denil, Bobak Shahriari, Steven Kapturowski, Richard Tanburn, Duncan Williams
[1] Dean Pomerleau, et al. ALVINN: An Autonomous Land Vehicle in a Neural Network, 1989, NIPS.
[2] Jürgen Schmidhuber, et al. Curious Model-Building Control Systems, 1991, IEEE International Joint Conference on Neural Networks.
[3] Andrew Y. Ng, et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping, 1999, ICML.
[4] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[5] John D. Hunter, et al. Matplotlib: A 2D Graphics Environment, 2007, Computing in Science & Engineering.
[6] Jason Weston, et al. Curriculum Learning, 2009, ICML.
[7] Wes McKinney, et al. Data Structures for Statistical Computing in Python, 2010, SciPy.
[8] Marc G. Bellemare, et al. Investigating Contingency Awareness Using Atari 2600 Games, 2012, AAAI.
[9] Joelle Pineau, et al. Learning from Limited Demonstrations, 2013, NIPS.
[10] Travis E. Oliphant, et al. Guide to NumPy, 2015.
[11] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2015, ICLR.
[12] Shane Legg, et al. Human-Level Control Through Deep Reinforcement Learning, 2015, Nature.
[13] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2015, IJCAI.
[14] Yuan Yu, et al. TensorFlow: A System for Large-Scale Machine Learning, 2016, OSDI.
[15] Stefano Ermon, et al. Generative Adversarial Imitation Learning, 2016, NIPS.
[16] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2016, ICML.
[17] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[18] Tom Schaul, et al. Prioritized Experience Replay, 2016, ICLR.
[19] Martin A. Riedmiller, et al. Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards, 2017, arXiv.
[20] Alex Graves, et al. Automated Curriculum Learning for Neural Networks, 2017, ICML.
[21] Stefano Ermon, et al. InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations, 2017, NIPS.
[22] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2017, ICLR.
[23] Yuval Tassa, et al. Learning Human Behaviors from Motion Capture by Adversarial Imitation, 2017, arXiv.
[24] Tom Schaul, et al. Rainbow: Combining Improvements in Deep Reinforcement Learning, 2018, AAAI.
[25] Sergey Levine, et al. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills, 2018, ACM Trans. Graph.
[26] David Budden, et al. Distributed Prioritized Experience Replay, 2018, ICLR.
[27] Marlos C. Machado, et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents, 2018, J. Artif. Intell. Res.
[28] Tim Salimans, et al. Learning Montezuma's Revenge from a Single Demonstration, 2018, arXiv.
[29] Nando de Freitas, et al. Playing Hard Exploration Games by Watching YouTube, 2018, NeurIPS.
[30] Sergio Gomez Colmenarejo, et al. One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL, 2018, arXiv.
[31] Sergey Levine, et al. Divide-and-Conquer Reinforcement Learning, 2018, ICLR.
[32] Alexander Novikov, et al. Visual Imitation with a Minimal Adversary, 2018.
[33] Rouhollah Rahmatizadeh, et al. Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-to-End Learning from Demonstration, 2018, ICRA.
[34] Rémi Munos, et al. Observe and Look Further: Achieving Consistent Performance on Atari, 2018, arXiv.
[35] Marcin Andrychowicz, et al. Overcoming Exploration in Reinforcement Learning with Demonstrations, 2018, ICRA.
[36] Jiashi Feng, et al. Policy Optimization with Demonstrations, 2018, ICML.
[37] Tom Schaul, et al. Deep Q-Learning from Demonstrations, 2018, AAAI.
[38] Albin Cassirer, et al. Randomized Prior Functions for Deep Reinforcement Learning, 2018, NeurIPS.
[39] Yoshua Bengio, et al. Reinforced Imitation in Heterogeneous Action Space, 2019, arXiv.
[40] Ilya Kostrikov, et al. Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning, 2019, ICLR.
[41] Pieter Abbeel, et al. Benchmarking Model-Based Reinforcement Learning, 2019, arXiv.
[42] Doina Precup, et al. Off-Policy Deep Reinforcement Learning without Exploration, 2019, ICML.
[43] Rémi Munos, et al. Recurrent Experience Replay in Distributed Reinforcement Learning, 2019, ICLR.
[44] Jian Peng, et al. Learning Belief Representations for Imitation Learning in POMDPs, 2019, UAI.
[45] Kenneth O. Stanley, et al. Go-Explore: A New Approach for Hard-Exploration Problems, 2019, arXiv.
[46] Julian Togelius, et al. Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning, 2019, IJCAI.
[47] Marlos C. Machado, et al. Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment, 2019, arXiv.
[48] Misha Denil, et al. Task-Relevant Adversarial Imitation Learning, 2019, CoRL.