Simulation-Based Methods for Markov Decision Processes
[1] G. J. Foschini,et al. Optimum Allocation of Servers to Two Types of Competing Customers , 1981, IEEE Trans. Commun..
[2] Peter W. Glynn,et al. Proceedings of the 1986 Winter Simulation , 2022 .
[3] Zbigniew Dziong,et al. Dynamic link bandwidth allocation in an integrated services network , 1989, IEEE International Conference on Communications, World Prosperity Through Communications.
[4] Pravin Varaiya,et al. Control of multiple service, multiple resource communication networks , 1991, IEEE INFOCOM '91. The Conference on Computer Communications. Tenth Annual Joint Conference of the IEEE Computer and Communications Societies Proceedings.
[5] Michael C. Fu,et al. Smoothed perturbation analysis derivative estimation for Markov chains , 1994, Oper. Res. Lett..
[6] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[7] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[8] E. Chong,et al. Stochastic optimization of regenerative systems using infinitesimal perturbation analysis , 1994, IEEE Trans. Autom. Control..
[9] Robert G. Gallager,et al. Discrete Stochastic Processes , 1995 .
[10] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[11] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[12] B. Delyon. General results on the convergence of stochastic algorithms , 1996, IEEE Trans. Autom. Control..
[13] John N. Tsitsiklis,et al. Reinforcement Learning for Call Admission Control and Routing in Integrated Service Networks , 1997, NIPS.
[14] V. Borkar. Stochastic approximation with two time scales , 1997 .
[15] John N. Tsitsiklis,et al. A neuro-dynamic programming approach to call admission control in integrated service networks : the single link case , 1997 .
[16] D. Bertsekas. Gradient convergence in gradient methods , 1997 .
[17] Keith W. Ross,et al. Multiservice Loss Models for Broadband Telecommunication Networks , 1997 .
[18] Xi-Ren Cao,et al. Perturbation realization, potentials, and sensitivity analysis of Markov processes , 1997, IEEE Trans. Autom. Control..
[19] Xi-Ren Cao,et al. Algorithms for sensitivity analysis of Markov systems through potentials and perturbation realization , 1998, IEEE Trans. Control. Syst. Technol..
[20] John N. Tsitsiklis,et al. Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives , 1999, IEEE Trans. Autom. Control..
[21] K. Schittkowski,et al. NONLINEAR PROGRAMMING , 2022 .