Optimality equations and inequalities in a class of risk-sensitive average cost Markov decision chains

This note concerns controlled Markov chains on a denumerable state space. The performance of a control policy is measured by the risk-sensitive average criterion, and it is assumed that (a) the simultaneous Doeblin condition holds, and (b) the system is communicating under the action of each stationary policy. If the cost function is bounded below, it is established that the optimal average cost is characterized by an optimality inequality, and it is shown that, even for bounded costs, such an inequality may be strict at every state. Also, for a nonnegative cost function with compact support, the existence and uniqueness of bounded solutions of the optimality equation is proved, and an example is provided to show that such a conclusion generally fails when the cost is negative at some state.
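For concreteness, the criterion and optimality equation referred to above can be sketched in the standard risk-sensitive form; the risk-sensitivity parameter $\lambda > 0$, the running cost $C$, and the transition law $p(\cdot \mid x,a)$ are the usual notation and are not fixed by the abstract itself:

```latex
% Risk-sensitive average cost of policy \pi starting at state x,
% with risk-sensitivity parameter \lambda > 0 and running cost C:
J(x,\pi) \;=\; \limsup_{n\to\infty} \frac{1}{\lambda n}
  \log \mathbb{E}_x^{\pi}\!\left[\exp\!\left(\lambda \sum_{t=0}^{n-1} C(X_t, A_t)\right)\right].

% Optimality equation for the optimal average cost g with relative value
% function h; the optimality inequality relaxes the equality below:
e^{\lambda\,(g + h(x))}
  \;=\; \min_{a \in A(x)} \left[\, e^{\lambda C(x,a)}
        \sum_{y} p(y \mid x, a)\, e^{\lambda h(y)} \right].
```

In this multiplicative (exponential) form, "the inequality may be strict at every state" means that no bounded $h$ turns the relaxed relation into an equality anywhere, in contrast with the risk-neutral average cost setting.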