Stable predictive representations with general value functions for continual learning
[1] Shalabh Bhatnagar, et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation, 2009, ICML '09.
[2] Adam M White, et al. Developing a Predictive Approach to Knowledge, 2015.
[3] Richard S. Sutton, et al. On the role of tracking in stationary environments, 2007, ICML '07.
[4] Michael H. Bowling, et al. Online Discovery and Learning of Predictive State Representations, 2005, NIPS.
[5] Eric Wiewiora, et al. Learning predictive representations from a history, 2005, ICML.
[6] Michael R. James, et al. Learning predictive state representations in dynamical systems without reset, 2005, ICML.
[7] Satinder P. Singh, et al. Kernel Predictive Linear Gaussian models for nonlinear stochastic dynamical systems, 2006, ICML.
[8] Richard S. Sutton, et al. Temporal Abstraction in Temporal-difference Networks, 2005, NIPS.
[9] Peter Stone, et al. Learning Predictive State Representations, 2003, ICML.
[10] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[11] Byron Boots, et al. Closing the learning-planning loop with predictive state representations, 2009, Int. J. Robotics Res.
[12] R. Sutton, et al. Gradient temporal-difference learning algorithms, 2011.
[13] Michael H. Bowling, et al. Learning predictive state representations using non-blind policies, 2006, ICML '06.
[14] Michael R. James, et al. Learning and discovery of predictive state representations in dynamical systems with reset, 2004, ICML.
[15] Satinder P. Singh, et al. Predictive state representations with options, 2006, ICML.
[16] Shalabh Bhatnagar, et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation, 2009, NIPS.
[17] Yaoliang Yu, et al. Minimizing Nonconvex Non-Separable Functions, 2015, AISTATS.
[18] Richard S. Sutton, et al. Temporal-Difference Networks with History, 2005, IJCAI.
[19] David Silver, et al. Gradient Temporal Difference Networks, 2012, EWRL.
[20] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[21] Sebastian Thrun, et al. Learning low dimensional predictive representations, 2004, ICML.
[22] R. Sutton, et al. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation, 2008, NIPS 2008.
[23] Bo Liu, et al. Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces, 2014, ArXiv.