An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming
Michael C. Fu | Jiaqiao Hu | Hyeong Soo Chang | Steven I. Marcus
[1] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[2] Christos G. Cassandras, et al. Ordinal optimisation and simulation, 2000, J. Oper. Res. Soc.
[3] C. D. Gelatt, et al. Optimization by Simulated Annealing, 1983, Science.
[4] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[5] Y. Freund, et al. Adaptive game playing using multiplicative weights, 1999.
[6] Alexander Shapiro, et al. The Sample Average Approximation Method for Stochastic Discrete Optimization, 2002, SIAM J. Optim.
[7] Shie Mannor, et al. PAC Bounds for Multi-armed Bandit and Markov Decision Processes, 2002, COLT.
[8] Jason H. Goodfriend, et al. Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization by the Score Function Method, 1995.
[9] Robert Givan, et al. Parallel Rollout for Online Solution of Partially Observable Markov Decision Processes, 2004, Discret. Event Dyn. Syst.
[10] Michael C. Fu, et al. An Adaptive Sampling Algorithm for Solving Markov Decision Processes, 2005, Oper. Res.
[11] Kaddour Najim, et al. Learning automata and stochastic optimization, 1997.
[12] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[13] S. Marcus, et al. An asymptotically efficient algorithm for finite horizon stochastic dynamic programming problems, 2003, 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475).
[14] Charles Leake, et al. Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization by the Score Function Method, 1994.
[15] W. Fleming. Book Review: Discrete-time Markov control processes: Basic optimality criteria, 1997.
[16] O. Hernández-Lerma, et al. Discrete-time Markov control processes, 1999.
[17] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[18] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.
[19] F. Topsøe. Bounds for entropy and divergence for distributions over a two-element set, 2001.
[20] Jim Freeman, et al. Stochastic Processes (Second Edition), 1996.
[21] A. Shwartz, et al. Guaranteed performance regions in Markovian systems with competing decision makers, 1993, IEEE Trans. Autom. Control.
[22] Sheldon M. Ross, et al. Stochastic Processes, 2018, Gauge Integral Structures for Stochastic Calculus and Quantum Electrodynamics.