A potential-based method for finite-stage Markov Decision Process