A survey of algorithmic methods for partially observed Markov decision processes

A partially observed Markov decision process (POMDP) generalizes the Markov decision process to allow for incomplete information about the state of the system. The significant applied potential of such processes remains largely unrealized, owing to a historical lack of tractable solution methodologies. This paper reviews current algorithmic alternatives for solving discrete-time, finite POMDPs over both finite and infinite horizons. The major impediment to exact solution is that, even with a finite set of internal system states, the set of possible information states is uncountably infinite: the decision maker must in general track a probability distribution (belief) over the system states. Finite algorithms are theoretically available for exact solution of the finite horizon problem, but these are computationally intractable for even modest-sized problems. Several approximation methodologies are reviewed that have the potential to generate computationally feasible, high-precision solutions.
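To make the two points in the abstract concrete, the following is a minimal Python sketch, not drawn from the paper itself: all model numbers, the function names `belief_update` and `exact_backup`, and the discount factor are illustrative assumptions. It shows (i) the Bayesian belief update that defines the information state, and (ii) one Sondik-style exact dynamic-programming backup over alpha-vectors, in which the candidate set grows as |A| x |Gamma|^|Z| per stage; this combinatorial growth is the intractability the abstract describes.

```python
import itertools
import numpy as np

def belief_update(b, a, z, P, O):
    """Bayes update of belief b after taking action a and observing z.

    Conventions (assumed, standard POMDP notation):
      P[a][s, s'] = Pr(s' | s, a);  O[a][s', z] = Pr(z | s', a).
    """
    b_next = O[a][:, z] * (b @ P[a])          # unnormalized posterior
    norm = b_next.sum()
    if norm == 0.0:
        raise ValueError("observation z has zero probability under (b, a)")
    return b_next / norm

def exact_backup(Gamma, P, O, R, gamma):
    """One exact DP stage: from the alpha-vector set Gamma representing V_t,
    enumerate all candidate vectors of V_{t+1}, so that
    V_{t+1}(b) = max over candidates alpha of b . alpha.

    The cross-product over observation-to-vector assignments is the source
    of the blow-up: |A| * |Gamma|^|Z| candidates per stage (no pruning here).
    """
    n_actions, n_obs = len(P), O[0].shape[1]
    new_vectors = []
    for a in range(n_actions):
        for assign in itertools.product(Gamma, repeat=n_obs):
            alpha = R[a].copy()
            for z, alpha_z in enumerate(assign):
                # sum over s' of P[a][s, s'] * O[a][s', z] * alpha_z(s')
                alpha += gamma * P[a] @ (O[a][:, z] * alpha_z)
            new_vectors.append(alpha)
    return new_vectors

# Toy two-state, two-action, two-observation model (hypothetical numbers).
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.5, 0.5]])]
O = [np.array([[0.8, 0.2], [0.3, 0.7]]),
     np.array([[0.6, 0.4], [0.4, 0.6]])]
R = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
gamma = 0.95

b = np.array([0.5, 0.5])
b = belief_update(b, a=0, z=1, P=P, O=O)

Gamma = [np.zeros(2)]                 # V_0 = 0
for _ in range(3):
    Gamma = exact_backup(Gamma, P, O, R, gamma)
print(len(Gamma), "candidate vectors after 3 stages")   # 2, then 8, then 128
```

Practical exact algorithms prune the many dominated candidate vectors at each stage, and the approximation methodologies the paper surveys sidestep the enumeration altogether, for example by bounding the value function; the sketch above deliberately omits pruning to exhibit the raw growth.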
