Learning Intrinsically Motivated Options to Stimulate Policy Exploration
Steven Latré | Louis Bagot | Kevin Mets