Basis function adaptation methods for cost approximation in MDP
[1] Stephen M. Robinson,et al. An Implicit-Function Theorem for a Class of Nonsmooth Functions , 1991, Math. Oper. Res..
[2] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[3] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[4] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[5] R. Tyrrell Rockafellar,et al. Variational Analysis , 1998, Grundlehren der mathematischen Wissenschaften.
[6] Richard S. Sutton,et al. Dimensions of Reinforcement Learning , 1998 .
[7] John N. Tsitsiklis,et al. Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives , 1999, IEEE Trans. Autom. Control..
[8] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[9] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[10] Vijay R. Konda,et al. On Actor-Critic Algorithms , 2003, SIAM J. Control. Optim..
[11] A. Barto,et al. Improved Temporal Difference Methods with Linear Function Approximation , 2004 .
[12] Shie Mannor,et al. Basis Function Adaptation in Temporal Difference Reinforcement Learning , 2005, Ann. Oper. Res..
[13] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[14] David Choi,et al. A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning , 2001, Discret. Event Dyn. Syst..
[15] Vivek S. Borkar,et al. Stochastic approximation with 'controlled Markov' noise , 2006, Systems & control letters (Print).
[16] D. Bertsekas,et al. A Least Squares Q-Learning Algorithm for Optimal Stopping Problems , 2007 .
[17] D. Bertsekas,et al. Projected Equation Methods for Approximate Solution of Large Linear Systems , 2022, Journal of Computational and Applied Mathematics.
[18] R. Tyrrell Rockafellar,et al. Robinson’s implicit function theorem and its extensions , 2009, Math. Program..