Solution of Large Systems of Equations Using Approximate Dynamic Programming Methods
[1] B. Rozovskii,et al. Optimal Stopping of Markov Processes , 1978 .
[2] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[3] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[4] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[5] S. Ioffe,et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming , 1996 .
[6] John N. Tsitsiklis,et al. Analysis of Temporal-Difference Learning with Function Approximation , 1996, NIPS.
[7] Dimitri P. Bertsekas,et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming , 1997 .
[8] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[9] John N. Tsitsiklis,et al. Average cost temporal-difference learning , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[10] John N. Tsitsiklis,et al. Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives , 1999, IEEE Trans. Autom. Control..
[11] Tim Hesterberg,et al. Monte Carlo Strategies in Scientific Computing , 2002, Technometrics.
[12] Dimitri P. Bertsekas,et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation , 2003, Discret. Event Dyn. Syst..
[13] Yousef Saad,et al. Iterative methods for sparse linear systems , 2003 .
[14] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[15] Justin A. Boyan,et al. Technical Update: Least-Squares Temporal Difference Learning , 2002, Machine Learning.
[16] A. Barto,et al. Improved Temporal Difference Methods with Linear Function Approximation , 2004 .
[17] Shie Mannor,et al. Basis Function Adaptation in Temporal Difference Reinforcement Learning , 2005, Ann. Oper. Res..
[18] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[19] David Choi,et al. A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning , 2001, Discret. Event Dyn. Syst..
[20] Shie Mannor,et al. Automatic basis function construction for approximate dynamic programming and reinforcement learning , 2006, ICML.
[21] Mario J. Valenti. Approximate dynamic programming with applications in multi-agent systems , 2007 .
[22] D. Bertsekas,et al. A Least Squares Q-Learning Algorithm for Optimal Stopping Problems , 2007 .
[23] Lihong Li,et al. Analyzing feature generation for value-function approximation , 2007, ICML '07.
[24] Vivek S. Borkar,et al. A Learning Algorithm for Risk-Sensitive Cost , 2008, Math. Oper. Res..
[25] Dimitri P. Bertsekas,et al. Convergence Results for Some Temporal Difference Methods Based on Least Squares , 2009, IEEE Transactions on Automatic Control.
[26] Richard S. Sutton,et al. Reinforcement Learning , 1992, Handbook of Machine Learning.