Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare | Sriram Srinivasan | Georg Ostrovski | Tom Schaul | David Saxton | Rémi Munos
[1] R. W. White. Motivation reconsidered: the concept of competence. Psychological Review, 1959.
[2] R. Bellman. Dynamic Programming. Princeton University Press, 1957.
[3] Jürgen Schmidhuber, et al. A possibility for implementing curiosity and boredom in model-building neural controllers, 1991.
[4] Thomas M. Cover, et al. Elements of Information Theory, 2005.
[5] Sebastian Thrun, et al. The role of exploration in learning control, 1992.
[6] Donald A. Sofge, et al. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, 1992.
[7] John N. Tsitsiklis, et al. Neuro-Dynamic Programming. Athena Scientific, 1996.
[8] Yoram Singer, et al. Efficient Bayesian Parameter Estimation in Large Discrete Domains. NIPS, 1998.
[9] Stuart J. Russell, et al. Bayesian Q-Learning. AAAI/IAAI, 1998.
[10] Yishay Mansour, et al. Convergence of Optimistic and Incremental Q-Learning. NIPS, 2001.
[11] Andrew G. Barto, et al. Optimal learning: computational procedures for Bayes-adaptive Markov decision processes, 2002.
[12] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning. J. Mach. Learn. Res., 2001.
[13] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning. NIPS, 2004.
[14] Mark B. Ring. CHILD: A First Step Towards Continual Learning. Machine Learning, 1997.
[15] Marcus Hutter. Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Texts in Theoretical Computer Science, An EATCS Series, 2005.
[16] Andrew G. Barto, et al. An intrinsic reward mechanism for efficient exploration. ICML, 2006.
[17] Jesse Hoey, et al. An analytic solution to discrete Bayesian reinforcement learning. ICML, 2006.
[18] Pierre-Yves Oudeyer, et al. Intrinsic Motivation Systems for Autonomous Mental Development. IEEE Transactions on Evolutionary Computation, 2007.
[19] Peter Auer, et al. Near-optimal Regret Bounds for Reinforcement Learning. J. Mach. Learn. Res., 2008.
[20] Michael L. Littman, et al. An analysis of model-based Interval Estimation for Markov Decision Processes. J. Comput. Syst. Sci., 2008.
[21] Andre Cohen, et al. An object-oriented representation for efficient reinforcement learning. ICML, 2008.
[22] Jürgen Schmidhuber, et al. Driven by Compression Progress. KES, 2008.
[23] Michael Bowling, et al. Dual Representations for Dynamic Programming, 2008.
[24] Michael I. Jordan, et al. Graphical Models, Exponential Families, and Variational Inference. Found. Trends Mach. Learn., 2008.
[25] Andrew Y. Ng, et al. Near-Bayesian exploration in polynomial time. ICML, 2009.
[26] Csaba Szepesvári, et al. Model-based reinforcement learning with nearly tight exploration complexity bounds. ICML, 2010.
[27] Hilbert J. Kappen, et al. Speedy Q-Learning. NIPS, 2011.
[28] Doina Precup, et al. An information-theoretic approach to curiosity-driven reinforcement learning. Theory in Biosciences, 2012.
[29] Tom Schaul, et al. Curiosity-driven optimization. IEEE Congress of Evolutionary Computation (CEC), 2011.
[30] Olivier Buffet, et al. Near-Optimal BRL using Optimistic Local Transitions. ICML, 2012.
[31] Marc G. Bellemare, et al. Investigating Contingency Awareness Using Atari 2600 Games. AAAI, 2012.
[32] Tor Lattimore, et al. PAC Bounds for Discounted MDPs. ALT, 2012.
[33] Sébastien Bubeck, et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. Found. Trends Mach. Learn., 2012.
[34] Odalric-Ambrym Maillard, et al. Hierarchical Optimistic Region Selection driven by Curiosity. NIPS, 2012.
[35] Pierre-Yves Oudeyer, et al. Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress. NIPS, 2012.
[36] Marcus Hutter, et al. Sparse Adaptive Dirichlet-Multinomial-like Processes. COLT, 2013.
[37] Laurent Orseau, et al. Universal Knowledge-Seeking Agents for Stochastic Environments. ALT, 2013.
[38] Andrew G. Barto, et al. Intrinsic Motivation and Reinforcement Learning. In Intrinsically Motivated Learning in Natural and Artificial Systems, 2013.
[39] Marc G. Bellemare, et al. Skip Context Tree Switching. ICML, 2014.
[40] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models. arXiv, 2015.
[41] Marc G. Bellemare, et al. Compress and Control. AAAI, 2015.
[42] Marc G. Bellemare. Count-Based Frequency Estimation with Bounded Memory. IJCAI, 2015.
[43] Marlos C. Machado, et al. Domain-Independent Optimistic Initialization for Reinforcement Learning. AAAI Workshop: Learning for General Competency in Video Games, 2014.
[44] Shakir Mohamed, et al. Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning. NIPS, 2015.
[45] Shane Legg, et al. Human-level control through deep reinforcement learning. Nature, 2015.
[46] Yann Ollivier, et al. Laplace's Rule of Succession in Information Geometry. GSI, 2015.
[47] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games. NIPS, 2015.
[48] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents. J. Artif. Intell. Res., 2012.
[49] Laurent Orseau, et al. Thompson Sampling is Asymptotically Optimal in General Environments. UAI, 2016.
[50] Filip De Turck, et al. VIME: Variational Information Maximizing Exploration. NIPS, 2016.
[51] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning. AAAI, 2015.
[52] Koray Kavukcuoglu, et al. Pixel Recurrent Neural Networks. ICML, 2016.
[53] Marlos C. Machado, et al. State of the Art Control of Atari Games Using Shallow Reinforcement Learning. AAMAS, 2015.
[54] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning. ICML, 2016.
[55] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 2016.
[56] Jason Pazis, et al. Efficient PAC-Optimal Exploration in Concurrent, Continuous State MDPs with Delayed Updates. AAAI, 2016.
[57] Tom Schaul, et al. Prioritized Experience Replay. ICLR, 2015.
[58] J. Schulman, et al. Variational Information Maximizing Exploration, 2016.
[59] Marc G. Bellemare, et al. Increasing the Action Gap: New Operators for Reinforcement Learning. AAAI, 2015.