Online policy iteration algorithm for semi-Markov switching state-space control processes
[1] Xi-Ren Cao, et al. Stochastic learning and optimization - A sensitivity-based approach, 2007, Annual Reviews in Control.
[2] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[3] Xi-Ren Cao, et al. The potential structure of sample paths and performance sensitivities of Markov systems, 2004, IEEE Transactions on Automatic Control.
[4] Xi-Ren Cao, et al. Semi-Markov decision problems and performance sensitivity analysis, 2003, IEEE Transactions on Automatic Control.
[5] William L. Cooper, et al. Convergence of Simulation-Based Policy Iteration, 2003, Probability in the Engineering and Informational Sciences.
[6] Xi-Ren Cao, et al. Perturbation realization, potentials, and sensitivity analysis of Markov processes, 1997, IEEE Transactions on Automatic Control.
[7] Abhijit Gosavi, et al. A Reinforcement Learning Algorithm Based on Policy Iteration for Average Reward: Empirical Results with Yield Management and Convergence Analysis, 2004, Machine Learning.
[8] Xi-Ren Cao, et al. Basic Ideas for Event-Based Optimization of Markov Systems, 2005, Discrete Event Dynamic Systems.
[9] Zhiyuan Ren, et al. Switching control in multi-mode Markov decision processes, 2001, Proceedings of the 40th IEEE Conference on Decision and Control (Cat. No.01CH37228).
[10] P. Varaiya, et al. Multilayer control of large Markov chains, 1978.
[11] Haitao Fang, et al. Potential-based online policy iteration algorithms for Markov decision processes, 2004, IEEE Transactions on Automatic Control.
[12] Klara Nahrstedt, et al. Distributed multimedia service composition with statistical QoS assurances, 2006, IEEE Transactions on Multimedia.
[13] Zhiyuan Ren, et al. A time aggregation approach to Markov decision processes, 2002, Automatica.
[14] G.-P. Dai, et al. Performance optimization algorithms based on potentials for semi-Markov control processes, 2005.