Generalized Classification-based Approximate Policy Iteration
[1] Alessandro Lazaric, et al. Analysis of a Classification-based Policy Iteration Algorithm, 2010, ICML.
[2] Alex M. Andrew, et al. Robot Learning, edited by Jonathan H. Connell and Sridhar Mahadevan, Kluwer, Boston, 1993/1997, xii+240 pp., ISBN 0-7923-9365-1 (Hardback, 218.00 Guilders, $120.00, £89.95), 1999, Robotica.
[3] Bruno Scherrer, et al. Classification-based Policy Iteration with a Critic, 2011, ICML.
[4] Michail G. Lagoudakis, et al. Reinforcement Learning as Classification: Leveraging Modern Classifiers, 2003, ICML.
[5] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[6] Gavin Taylor, et al. Kernelized value function approximation for reinforcement learning, 2009, ICML '09.
[7] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[8] A. Tsybakov, et al. Fast learning rates for plug-in classifiers, 2007, 0708.2321.
[9] Shie Mannor, et al. Regularized Policy Iteration, 2008, NIPS.
[10] Csaba Szepesvári, et al. Error Propagation for Approximate Policy and Value Iteration, 2010, NIPS.
[11] Andrew Y. Ng, et al. Regularization and feature selection in least-squares temporal difference learning, 2009, ICML '09.
[12] Robert Givan, et al. Approximate Policy Iteration with a Policy Language Bias, 2003, NIPS.
[13] P. Bartlett, et al. Local Rademacher complexities, 2005, math/0508275.
[14] Rémi Munos, et al. Performance Bounds in Lp-norm for Approximate Value Iteration, 2007, SIAM J. Control Optim.
[15] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[16] Matthew W. Hoffman, et al. Finite-Sample Analysis of Lasso-TD, 2011, ICML.
[17] Csaba Szepesvári. Algorithms for Reinforcement Learning, 2010.
[18] Amir Massoud Farahmand, et al. Action-Gap Phenomenon in Reinforcement Learning, 2011, NIPS.
[19] Peter L. Bartlett, et al. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results, 2003, J. Mach. Learn. Res.
[20] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[21] Rémi Munos, et al. Error Bounds for Approximate Policy Iteration, 2003, ICML.