Performance Bounds in Lp-norm for Approximate Value Iteration
[1] R. Bellman. Dynamic programming, 1957, Science.
[2] D. Pollard. Convergence of stochastic processes, 1984.
[3] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.
[4] O. Hernández-Lerma, et al. Recurrence conditions for Markov decision processes with Borel state space: A survey, 1991.
[5] A. Hordijk, et al. On ergodicity and recurrence properties of a Markov chain by an application to an open Jackson network, 1992, Advances in Applied Probability.
[6] P. Bougerol, et al. Strict Stationarity of Generalized Autoregressive Processes, 1992.
[7] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming, 1995, ICML.
[8] Alexander J. Smola, et al. Support Vector Method for Function Approximation, Regression Estimation and Signal Processing, 1996, NIPS.
[9] John Rust. Numerical dynamic programming in economics, 1996.
[10] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS.
[11] S. Mallat, et al. Adaptive greedy approximations, 1997.
[12] Vladimir Vapnik, et al. Statistical learning theory, 1998.
[13] S. Mallat. A wavelet tour of signal processing, 1998.
[14] Geoffrey J. Gordon, et al. Approximate solutions to Markov decision processes, 1999.
[15] Daphne Koller, et al. Policy Iteration for Factored MDPs, 2000, UAI.
[16] Arthur L. Samuel, et al. Some studies in machine learning using the game of checkers, 2000, IBM J. Res. Dev.
[17] Carlos Guestrin, et al. Max-norm Projections for Factored MDPs, 2001, IJCAI.
[18] Sean P. Meyn. Stability, Performance Evaluation, and Optimization, 2002.
[19] Rémi Munos, et al. Error Bounds for Approximate Policy Iteration, 2003, ICML.
[20] Sham M. Kakade, et al. On the sample complexity of reinforcement learning, 2003.
[21] Benjamin Van Roy, et al. The Linear Programming Approach to Approximate Dynamic Programming, 2003, Oper. Res.
[22] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[23] Andrew W. Moore, et al. Locally Weighted Learning, 1997, Artificial Intelligence Review.
[24] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.