Risk-Sensitive Optimal Control for Markov Decision Processes with Monotone Cost

The existence of an optimal feedback law is established for the risk-sensitive optimal control problem with denumerable state space. The main assumptions imposed are irreducibility and a near-monotonicity condition on the one-step cost function. A solution can be found constructively using either value iteration or policy iteration, under suitable conditions on the initial feedback law.
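As an illustration of the value iteration idea, the following is a minimal sketch and not the paper's algorithm. It assumes a finite state space (a truncation of the denumerable setting), strictly positive transition probabilities, and the standard multiplicative optimality equation e^λ h(x) = min_a e^{c(x,a)} Σ_y P_a(x,y) h(y). The function name, the array layout `P[a][x][y]` and `c[x][a]`, and the normalization at a reference state `ref` are all hypothetical choices made for this example.

```python
import math


def risk_sensitive_value_iteration(P, c, ref=0, tol=1e-10, max_iter=10_000):
    """Relative value iteration for the multiplicative optimality equation.

    Assumed (hypothetical) layout: P[a][x][y] is the transition probability
    from x to y under action a, and c[x][a] is the one-step cost.
    Returns (lam, h, policy), where lam estimates the optimal risk-sensitive
    average cost and h is normalized so that h[ref] == 1.
    """
    n_states = len(c)
    n_actions = len(P)
    h = [1.0] * n_states
    lam = 0.0
    policy = [0] * n_states
    for _ in range(max_iter):
        Th = []
        for x in range(n_states):
            # Multiplicative Bellman backup for each action at state x.
            vals = [
                math.exp(c[x][a]) * sum(P[a][x][y] * h[y] for y in range(n_states))
                for a in range(n_actions)
            ]
            best = min(range(n_actions), key=vals.__getitem__)
            policy[x] = best
            Th.append(vals[best])
        # Normalize at the reference state; the scale estimates e^lam.
        scale = Th[ref]
        h_new = [v / scale for v in Th]
        lam = math.log(scale)
        diff = max(abs(u - v) for u, v in zip(h_new, h))
        h = h_new
        if diff < tol:
            break
    return lam, h, policy


# Hypothetical two-state, two-action example with strictly positive kernels.
P_example = [
    [[0.9, 0.1], [0.5, 0.5]],  # action 0
    [[0.5, 0.5], [0.1, 0.9]],  # action 1
]
c_example = [[0.0, 0.2], [1.0, 1.2]]  # c[x][a]
lam, h, policy = risk_sensitive_value_iteration(P_example, c_example)
print(lam, h, policy)
```

The returned `policy` plays the role of a feedback law: it maps each state to the action attaining the minimum in the final backup. Whether the iteration converges in the denumerable setting depends on the conditions stated in the paper, which this finite sketch does not check.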
