Projected Equation Methods for Approximate Solution of Large Linear Systems (Journal of Computational and Applied Mathematics)

[1] Dimitri P. Bertsekas, et al. Convergence Results for Some Temporal Difference Methods Based on Least Squares, 2009, IEEE Transactions on Automatic Control.

[2] Dimitri P. Bertsekas, et al. Neuro-Dynamic Programming, 2009, Encyclopedia of Optimization.

[3] Vivek S. Borkar, et al. A Learning Algorithm for Risk-Sensitive Cost, 2008, Math. Oper. Res.

[4] Lihong Li, et al. Analyzing feature generation for value-function approximation, 2007, ICML '07.

[5] D. Bertsekas, et al. A Least Squares Q-Learning Algorithm for Optimal Stopping Problems, 2007.

[6] Mario J. Valenti. Approximate dynamic programming with applications in multi-agent systems, 2007.

[7] D. Bertsekas, et al. Solution of Large Systems of Equations Using Approximate Dynamic Programming Methods, 2007.

[8] Shie Mannor, et al. Automatic basis function construction for approximate dynamic programming and reinforcement learning, 2006, ICML.

[9] David Choi, et al. A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning, 2001, Discret. Event Dyn. Syst.

[10] Shie Mannor, et al. Basis Function Adaptation in Temporal Difference Reinforcement Learning, 2005, Ann. Oper. Res.

[11] Richard S. Sutton. Learning to predict by the methods of temporal differences, 1988, Machine Learning.

[12] Andrew G. Barto, et al. Linear Least-Squares Algorithms for Temporal Difference Learning, 2005, Machine Learning.

[13] Justin A. Boyan. Technical Update: Least-Squares Temporal Difference Learning, 2002, Machine Learning.

[14] A. Barto, et al. Improved Temporal Difference Methods with Linear Function Approximation, 2004.

[15] Yousef Saad. Iterative methods for sparse linear systems, 2003.

[16] Dimitri P. Bertsekas, et al. Least Squares Policy Evaluation Algorithms with Linear Function Approximation, 2003, Discret. Event Dyn. Syst.

[17] Jun S. Liu. Monte Carlo strategies in scientific computing, 2001.

[18] John N. Tsitsiklis, et al. Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives, 1999, IEEE Trans. Autom. Control.

[19] Andrew G. Barto, et al. Reinforcement learning, 1998.

[20] Richard S. Sutton, et al. Dimensions of Reinforcement Learning, 1998.

[21] John N. Tsitsiklis, et al. Average cost temporal-difference learning, 1997, Proceedings of the 36th IEEE Conference on Decision and Control.

[22] Dimitri P. Bertsekas, et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming, 1997.

[23] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS.

[24] S. Ioffe, et al. Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming, 1996.

[25] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.

[26] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.

[27] Andrew G. Barto, et al. Monte Carlo Matrix Inversion and Reinforcement Learning, 1993, NIPS.

[28] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.

[29] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Vol. II, 1976.

[30] J. Halton. A Retrospective and Prospective Survey of the Monte Carlo Method, 1970.

[31] S. Vajda, et al. Symposium on Monte Carlo Methods, 1957, The Mathematical Gazette.

[32] W. Wasow. A note on the inversion of matrices by random walks, 1952.

[33] R. A. Leibler, et al. Matrix inversion by a Monte Carlo method, 1950.