Partially Observable Markov Decision Processes and Performance Sensitivity Analysis
[1] Haitao Fang, et al. Potential-based online policy iteration algorithms for Markov decision processes, 2004, IEEE Trans. Autom. Control.
[2] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[3] Craig Boutilier, et al. Bounded Finite State Controllers, 2003, NIPS.
[4] Xi-Ren Cao, et al. Basic Ideas for Event-Based Optimization of Markov Systems, 2005, Discret. Event Dyn. Syst.
[5] Eric A. Hansen, et al. An Improved Policy Iteration Algorithm for Partially Observable MDPs, 1997, NIPS.
[6] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes, 1991.
[7] Xi-Ren Cao, et al. Perturbation realization, potentials, and sensitivity analysis of Markov processes, 1997, IEEE Trans. Autom. Control.
[8] Kevin P. Murphy, et al. A Survey of POMDP Solution Techniques, 2000.
[9] Ari Arapostathis, et al. On the existence of stationary optimal policies for partially observed MDPs under the long-run average cost criterion, 2006, Syst. Control. Lett.
[10] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996.
[11] Xi-Ren Cao, et al. Stochastic learning and optimization: A sensitivity-based approach, 2007, Annual Reviews in Control.
[12] A. Cassandra, et al. Exact and approximate algorithms for partially observable Markov decision processes, 1998.
[13] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[14] Xi-Ren Cao, et al. From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning, 2003, Discret. Event Dyn. Syst.
[15] Xi-Ren Cao, et al. A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases, 2004, at - Automatisierungstechnik.
[16] D. Bertsekas, et al. Approximate solution methods for partially observable Markov and semi-Markov decision processes, 2006.
[17] Douglas Aberdeen, et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes, 2003.
[18] Milos Hauskrecht, et al. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.
[19] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[20] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.