VIME: Variational Information Maximizing Exploration
Rein Houthooft | Xi Chen | Yan Duan | John Schulman | Filip De Turck | Pieter Abbeel
[1] Jürgen Schmidhuber, et al. Curious model-building control systems, 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[2] Geoffrey E. Hinton, et al. Keeping the neural networks simple by minimizing the description length of the weights, 1993, COLT '93.
[3] S. Hochreiter, et al. Reinforcement Driven Information Acquisition in Non-deterministic Environments, 1995.
[4] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res..
[5] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res..
[6] John Langford, et al. Exploration in Metric State Spaces, 2003, ICML.
[7] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[8] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 2002, Machine Learning.
[9] Stefan Schaal, et al. Reinforcement learning by reward-weighted regression for operational space control, 2007, ICML '07.
[10] Jürgen Schmidhuber, et al. Simple Algorithmic Principles of Discovery, Subjective Beauty, Selective Attention, Curiosity & Creativity, 2007, Discovery Science.
[11] Pierre-Yves Oudeyer, et al. What is Intrinsic Motivation? A Typology of Computational Approaches, 2007, Frontiers Neurorobotics.
[12] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning, 2008, J. Mach. Learn. Res..
[13] Andrew Y. Ng, et al. Near-Bayesian exploration in polynomial time, 2009, ICML '09.
[14] Pierre Baldi, et al. Bayesian surprise attracts human attention, 2005, Vision Research.
[15] Jürgen Schmidhuber, et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010), 2010, IEEE Transactions on Autonomous Mental Development.
[16] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2011, Machine Learning.
[17] Alex Graves, et al. Practical Variational Inference for Neural Networks, 2011, NIPS.
[18] Yi Sun, et al. Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments, 2011, AGI.
[19] Doina Precup, et al. An information-theoretic approach to curiosity-driven reinforcement learning, 2012, Theory in Biosciences.
[20] Eduardo F. Morales, et al. An Introduction to Reinforcement Learning, 2011.
[21] Pierre-Yves Oudeyer, et al. Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress, 2012, NIPS.
[22] Keyan Zahedi, et al. Linear combination of one-step predictive information with an external reward in an episodic policy gradient setting: a critical analysis, 2013, Front. Psychol..
[23] Friedrich T. Sommer, et al. Learning and exploration in action-perception loops, 2013, Front. Neural Circuits.
[24] Jason Pazis, et al. PAC Optimal Exploration in Continuous Space Markov Decision Processes, 2013, AAAI.
[25] Peter Dayan, et al. Bayes-Adaptive Simulation-based Search with Value Function Approximation, 2014, NIPS.
[26] Mikhail Prokopenko, et al. Guided Self-Organization: Inception, 2014.
[27] Shie Mannor, et al. Bayesian Reinforcement Learning: A Survey, 2015, Found. Trends Mach. Learn..
[28] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[29] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[30] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[31] Shakir Mohamed, et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, 2015, NIPS.
[32] Julien Cornebise, et al. Weight Uncertainty in Neural Networks, 2015, ArXiv.
[33] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[34] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[35] Diederik P. Kingma, et al. Variational Dropout and the Local Reparameterization Trick, 2015, NIPS.
[36] Benjamin Van Roy, et al. Generalization and Exploration via Randomized Value Functions, 2014, ICML.
[37] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[38] Keyan Zahedi, et al. Information Theoretically Aided Reinforcement Learning for Embodied Agents, 2016, ArXiv.
[39] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[40] Arend Hintze, et al. Information-theoretic neuro-correlates boost evolution of cognitive systems, 2015, Entropy.
[41] Andrea Lockerd Thomaz, et al. Exploration from Demonstration for Interactive Reinforcement Learning, 2016, AAMAS.
[42] Peter Stone, et al. Intrinsically motivated model learning for developing curious robots, 2017, Artif. Intell..
[43] John Langford, et al. Efficient Exploration in Reinforcement Learning, 2017, Encyclopedia of Machine Learning and Data Mining.