Planning with Expectation Models
暂无分享,去创建一个
Martha White | Yi Wan | Richard S. Sutton | Adam White | Muhammad Zaheer | R. Sutton | Adam White | Martha White | Yi Wan | M. Zaheer
[1] Katja Hofmann,et al. A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games , 2016, ICLR 2016.
[2] Martha White,et al. Learning Sparse Representations in Reinforcement Learning with Sparse Coding , 2017, IJCAI.
[3] Martha White,et al. Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains , 2018, IJCAI.
[4] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[5] Patrick M. Pilarski,et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction , 2011, AAMAS.
[6] Shalabh Bhatnagar,et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation , 2009, ICML '09.
[7] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008, Texts and Readings in Mathematics.
[8] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[9] Richard S. Sutton,et al. Multi-timescale nexting in a reinforcement learning robot , 2011, Adapt. Behav..
[10] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[11] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[12] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[13] Michael I. Jordan,et al. Advances in Neural Information Processing Systems 30 , 1995 .
[14] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[15] Honglak Lee,et al. Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.
[16] Andrew W. Moore,et al. Prioritized sweeping: Reinforcement learning with less data and less time , 2004, Machine Learning.
[17] Adam M White,et al. DEVELOPING A PREDICTIVE APPROACH TO KNOWLEDGE , 2015 .
[18] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[19] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[20] Richard S. Sutton,et al. Integrated Modeling and Control Based on Reinforcement Learning and Dynamic Programming , 1990, NIPS 1990.
[21] Jean-Arcady Meyer,et al. Adaptive Behavior , 2005 .
[22] Alborz Geramifard,et al. Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping , 2008, UAI.
[23] Lihong Li,et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning , 2008, ICML '08.
[24] George Konidaris,et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis , 2011, AAAI.