Risk-sensitive optimal control of hidden Markov models: structural results

The authors consider a risk-sensitive optimal control problem for hidden Markov models (HMMs) with finite state and action spaces. They investigate the nature and structure of risk-sensitive controllers for HMMs, presenting several general structural results together with a case study of a popular benchmark problem. For the case study, they obtain structural results for the optimal risk-sensitive controller and compare them with those for the risk-neutral controller. They further show that the risk-sensitive controller and its corresponding information state converge to the known risk-neutral solutions as the risk factor goes to zero. They also study the infinite and general risk aversion cases.
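The small-risk limit mentioned in the abstract can be illustrated numerically. The sketch below is not the authors' model: it assumes the standard exponential-of-cost criterion, (1/gamma) * log E[exp(gamma * C)], applied to a hypothetical three-outcome cost distribution, and the names `risk_sensitive_cost`, `costs`, and `probs` are invented for illustration. It only shows that this criterion approaches the expected cost as the risk factor `gamma` goes to zero and approaches the worst-case cost as `gamma` grows.

```python
import math

def risk_sensitive_cost(costs, probs, gamma):
    """Certainty-equivalent cost (1/gamma) * log E[exp(gamma * C)].

    Assumed criterion (exponential utility); gamma == 0 is treated as the
    risk-neutral case and returns the plain expectation E[C].
    """
    if gamma == 0.0:
        return sum(p * c for p, c in zip(probs, costs))
    # Shift by the largest exponent (log-sum-exp) to avoid overflow.
    m = max(gamma * c for c in costs)
    s = sum(p * math.exp(gamma * c - m) for p, c in zip(probs, costs))
    return (m + math.log(s)) / gamma

# Hypothetical cost distribution, chosen only for illustration.
costs = [0.0, 1.0, 10.0]
probs = [0.5, 0.4, 0.1]

risk_neutral = risk_sensitive_cost(costs, probs, 0.0)
for gamma in (50.0, 1.0, 0.1, 1e-4):
    value = risk_sensitive_cost(costs, probs, gamma)
    print(f"gamma={gamma:g}: cost={value:.4f} (risk-neutral {risk_neutral:.4f})")
```

For positive `gamma` the value sits above the expected cost and decreases toward it as `gamma` shrinks, which is the single-stage analogue of the convergence to the risk-neutral solution described above.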
