LSPI with Random Projections
暂无分享,去创建一个
[1] Andrew G. Barto,et al. Linear Least-Squares Algorithms for Temporal Difference Learning , 2005, Machine Learning.
[2] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[3] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[4] Santosh S. Vempala,et al. The Random Projection Method , 2005, DIMACS Series in Discrete Mathematics and Theoretical Computer Science.
[5] Sridhar Mahadevan,et al. Representation Policy Iteration , 2005, UAI.
[6] Shie Mannor,et al. Basis Function Adaptation in Temporal Difference Reinforcement Learning , 2005, Ann. Oper. Res..
[7] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[8] Shie Mannor,et al. Automatic basis function construction for approximate dynamic programming and reinforcement learning , 2006, ICML.
[9] Csaba Szepesvári,et al. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path , 2006, Machine Learning.
[10] M. Loth,et al. Sparse Temporal Difference Learning Using LASSO , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[11] Lihong Li,et al. Analyzing feature generation for value-function approximation , 2007, ICML '07.
[12] Csaba Szepesvári,et al. Finite-Time Bounds for Fitted Value Iteration , 2008, J. Mach. Learn. Res..
[13] Shie Mannor,et al. Regularized Policy Iteration , 2008, NIPS.
[14] Shie Mannor,et al. Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems , 2009, 2009 American Control Conference.
[15] Rémi Munos,et al. Compressed Least-Squares Regression , 2009, NIPS.
[16] Andrew Y. Ng,et al. Regularization and feature selection in least-squares temporal difference learning , 2009, ICML '09.
[17] M. Rudelson,et al. Non-asymptotic theory of random matrices: extreme singular values , 2010, 1003.2990.
[18] Marek Petrik,et al. Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes , 2010, ICML.
[19] Alessandro Lazaric,et al. Finite-Sample Analysis of LSTD , 2010, ICML.
[20] R. Munos,et al. Brownian Motions and Scrambled Wavelets for Least-Squares Regression , 2010 .
[21] Alessandro Lazaric,et al. Finite-sample analysis of least-squares policy iteration , 2012, J. Mach. Learn. Res..