Quasi Newton Temporal Difference Learning
暂无分享,去创建一个
[1] Zhi-Qiang Liu,et al. Preconditioned temporal difference learning , 2008, ICML '08.
[2] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[3] Alborz Geramifard,et al. Incremental Least-Squares Temporal Difference Learning , 2006, AAAI.
[4] Shie Mannor,et al. Reinforcement learning with Gaussian processes , 2005, ICML.
[5] Patrick M. Pilarski,et al. Tuning-free step-size adaptation , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Jan Peters,et al. Policy evaluation with temporal differences: a survey and comparison , 2015, J. Mach. Learn. Res..
[7] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[8] R. Sutton,et al. Gradient temporal-difference learning algorithms , 2011 .
[9] Patrick Gallinari,et al. SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent , 2009, J. Mach. Learn. Res..
[10] Shalabh Bhatnagar,et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation , 2009, ICML '09.
[11] H. Robbins. A Stochastic Approximation Method , 1951 .
[12] Dimitri P. Bertsekas,et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation , 2003, Discret. Event Dyn. Syst..
[13] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[14] Andrew G. Barto,et al. Adaptive Step-Size for Online Temporal Difference Learning , 2012, AAAI.
[15] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[16] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[17] Patrick Gallinari,et al. Erratum: SGDQN is Less Careful than Expected , 2010, J. Mach. Learn. Res..
[18] David Choi,et al. A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning , 2001, Discret. Event Dyn. Syst..