Error Propagation for Approximate Policy and Value Iteration
暂无分享,去创建一个
[1] Csaba Szepesvári,et al. Finite-Time Bounds for Fitted Value Iteration , 2008, J. Mach. Learn. Res..
[2] Alessandro Lazaric,et al. Finite-sample Analysis of Bellman Residual Minimization , 2010, ACML.
[3] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..
[4] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[5] Csaba Szepesvári,et al. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path , 2006, Machine Learning.
[6] Rémi Munos,et al. Error Bounds for Approximate Policy Iteration , 2003, ICML.
[7] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[8] Daniel Polani,et al. Least Squares SVM for Least Squares TD Learning , 2006, ECAI.
[9] Rémi Munos,et al. Performance Bounds in Lp-norm for Approximate Value Iteration , 2007, SIAM J. Control. Optim..
[10] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[11] Andrew Y. Ng,et al. Regularization and feature selection in least-squares temporal difference learning , 2009, ICML '09.
[12] Dimitri P. Bertsekas,et al. Stochastic optimal control : the discrete time case , 2007 .
[13] Adam Krzyzak,et al. A Distribution-Free Theory of Nonparametric Regression , 2002, Springer series in statistics.
[14] Xin Xu,et al. Kernel-Based Least Squares Policy Iteration for Reinforcement Learning , 2007, IEEE Transactions on Neural Networks.
[15] Shie Mannor,et al. Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems , 2009, 2009 American Control Conference.
[16] Shie Mannor,et al. Regularized Policy Iteration , 2008, NIPS.
[17] Alborz Geramifard,et al. iLSTD: Eligibility Traces and Convergence Analysis , 2006, NIPS.
[18] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[19] Gavin Taylor,et al. Kernelized value function approximation for reinforcement learning , 2009, ICML '09.
[20] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[21] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[22] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.