Partially Observable Markov Decision Processes With Reward Information: Basic Ideas and Models
[1] S. Ross. Arbitrary State Markovian Decision Processes, 1968.
[2] T. Yoshikawa, et al. Discrete-Time Markovian Decision Processes with Incomplete State Observation, 1970.
[3] S. Ross. Quality Control under Markovian Deterioration, 1971.
[4] E. J. Sondik, et al. The Optimal Control of Partially Observable Markov Decision Processes, 1971.
[5] D. Rhenius. Incomplete Information in Markovian Decision Models, 1974.
[6] Robert C. Wang. Computing optimal quality control policies: two actions, 1976.
[7] Robert C. Wang, et al. Optimal Replacement Policy with Unobservable States, 1977.
[8] C. White. Optimal control-limit strategies for a partially observed replacement problem, 1979.
[9] C. White. Bounds on optimal cost for a replacement problem with partial observations, 1979.
[10] H. Mine, et al. An Optimal Inspection and Replacement Policy under Incomplete State Information: Average Cost Criterion, 1984.
[11] Hajime Kawai, et al. An optimal inspection and replacement policy under incomplete state information, 1986.
[12] J. Walrand, et al. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays, Part II: Markovian rewards, 1987.
[13] William S. Lovejoy, et al. Some Monotonicity Results for Partially Observed Markov Decision Processes, 1987, Oper. Res.
[14] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes, 1991.
[15] M. K. Ghosh, et al. Discrete-time controlled Markov processes with average cost criterion: a survey, 1993.
[16] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[17] W. Fleming. Book Review: Discrete-Time Markov Control Processes: Basic Optimality Criteria, 1997.
[18] O. Hernández-Lerma, et al. Further Topics on Discrete-Time Markov Control Processes, 1999.
[19] Vivek S. Borkar, et al. Average Cost Dynamic Programming Equations for Controlled Markov Chains with Partial Observations, 2000, SIAM J. Control Optim.
[20] Limiting discounted-cost control of partially observable stochastic systems, 2000, Proceedings of the 39th IEEE Conference on Decision and Control.
[21] Sanjeev R. Kulkarni, et al. Finite-time lower bounds for the two-armed bandit problem, 2000, IEEE Trans. Autom. Control.
[22] V. Borkar. Dynamic programming for ergodic control with partial observations, 2003.
[23] Xi-Ren Cao, et al. A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases, 2004, at - Automatisierungstechnik.
[24] Xi-Ren Cao, et al. Optimal Control of Ergodic Continuous-Time Markov Chains with Average Sample-Path Rewards, 2005, SIAM J. Control Optim.
[25] B. Nordstrom. Finite Markov Chains, 2005.
[26] Y. Ho, et al. Vector Ordinal Optimization, 2005.
[27] L. Platzman. Optimal Infinite-Horizon Undiscounted Control of Finite Probabilistic Systems, 2006.
[28] Yu-Chi Ho, et al. Constrained Ordinal Optimization: A Feasibility Model Based Approach, 2006, Discret. Event Dyn. Syst.