Self-Imitation Learning via Trajectory-Conditioned Policy for Hard-Exploration Tasks
[1] Lambert Schomaker, et al. Self-Imitation Learning by Planning, 2021, IEEE International Conference on Robotics and Automation (ICRA).
[2] Richard Tanburn, et al. Making Efficient Use of Demonstrations to Solve Hard Exploration Problems, 2019, ICLR.
[3] Tor Lattimore, et al. Behaviour Suite for Reinforcement Learning, 2019, ICLR.
[4] Sergey Levine, et al. Skew-Fit: State-Covering Self-Supervised Reinforcement Learning, 2019, ICML.
[5] Kenneth O. Stanley, et al. Go-Explore: a New Approach for Hard-Exploration Problems, 2019, arXiv.
[6] Tim Salimans, et al. Learning Montezuma's Revenge from a Single Demonstration, 2018, arXiv.
[7] Honglak Lee, et al. Contingency-Aware Exploration in Reinforcement Learning, 2018, ICLR.
[8] Amos J. Storkey, et al. Exploration by Random Network Distillation, 2018, ICLR.
[9] Emma Brunskill, et al. Learning Abstract Models for Long-Horizon Exploration, 2018.
[10] Lihong Li, et al. Explicit Recall for Efficient Exploration, 2018.
[11] Satinder Singh, et al. Generative Adversarial Self-Imitation Learning, 2018, arXiv.
[12] Alexei A. Efros, et al. Large-Scale Study of Curiosity-Driven Learning, 2018, ICLR.
[13] J. Clune, et al. Deep Curiosity Search: Intra-Life Exploration Improves Performance on Challenging Deep Reinforcement Learning Problems, 2018, arXiv.
[14] Rémi Munos, et al. Observe and Look Further: Achieving Consistent Performance on Atari, 2018, arXiv.
[15] Nando de Freitas, et al. Playing hard exploration games by watching YouTube, 2018, NeurIPS.
[16] Qiang Liu, et al. Learning Self-Imitating Diverse Policies, 2018, ICLR.
[17] Marcin Andrychowicz, et al. Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research, 2018, arXiv.
[18] Sergey Levine, et al. Diversity is All You Need: Learning Skills without a Reward Function, 2018, ICLR.
[19] Jitendra Malik, et al. Zero-Shot Visual Imitation, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[20] Sergey Levine, et al. Divide-and-Conquer Reinforcement Learning, 2017, ICLR.
[21] Stefanie Tellex, et al. Deep Abstract Q-Networks, 2017, AAMAS.
[22] Marlos C. Machado, et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents, 2017, J. Artif. Intell. Res.
[23] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, arXiv.
[24] Marcin Andrychowicz, et al. Hindsight Experience Replay, 2017, NIPS.
[25] Pieter Abbeel, et al. Automatic Goal Generation for Reinforcement Learning Agents, 2017, ICML.
[26] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[27] Tom Schaul, et al. Deep Q-learning from Demonstrations, 2017, AAAI.
[28] Marcin Andrychowicz, et al. One-Shot Imitation Learning, 2017, NIPS.
[29] Jitendra Malik, et al. Combining self-supervised learning and imitation for vision-based rope manipulation, 2017, IEEE International Conference on Robotics and Automation (ICRA).
[30] Daan Wierstra, et al. Variational Intrinsic Control, 2016, ICLR.
[31] Filip De Turck, et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning, 2016, NIPS.
[32] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[33] Andrea Lockerd Thomaz, et al. Exploration from Demonstration for Interactive Reinforcement Learning, 2016, AAMAS.
[34] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[35] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[36] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[37] Tom Schaul, et al. Universal Value Function Approximators, 2015, ICML.
[38] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, arXiv.
[39] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[40] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[41] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[42] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[43] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[44] Michael L. Littman, et al. An analysis of model-based Interval Estimation for Markov Decision Processes, 2008, J. Comput. Syst. Sci.
[45] Pierre-Yves Oudeyer, et al. What is Intrinsic Motivation? A Typology of Computational Approaches, 2007, Frontiers Neurorobotics.
[46] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[47] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[48] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[49] Sebastian Thrun, et al. Active Exploration in Dynamic Environments, 1991, NIPS.
[50] Jürgen Schmidhuber, et al. Curious model-building control systems, 1991, IEEE International Joint Conference on Neural Networks (IJCNN).
[51] Georg Ostrovski, et al. Count-Based Exploration with Neural Density Models, 2017, ICML.
[52] Jürgen Schmidhuber, et al. Adaptive confidence and adaptive curiosity, 1991, Forschungsberichte, TU Munich.