Discounted Markov decision processes with utility constraints

We consider utility-constrained Markov decision processes, in which the expected utility of the total discounted reward is maximized subject to multiple expected-utility constraints. By introducing a corresponding Lagrange function, a saddle-point theorem for the utility-constrained optimization problem is derived. The existence of a constrained optimal policy is characterized via optimal action sets specified by a parametric utility.
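The Lagrangian saddle-point idea can be illustrated numerically. The sketch below is a hedged toy example, not the paper's construction: it uses a single-state MDP with two actions, a linear (risk-neutral) utility instead of the paper's general utility functions, and assumed reward, cost, and discount values. It compares the primal constrained optimum over randomized stationary policies with the dual value of the Lagrange function, and the two coincide at the saddle point.

```python
# Toy sketch (assumed numbers): one state, two actions, discount beta.
# A stationary randomized policy picks action 0 with probability p.
# Maximize expected discounted reward subject to a discounted-cost bound,
# via the Lagrange function L(p, lam) = V_r(p) - lam * (V_c(p) - c_max).

beta = 0.5        # discount factor (chosen for exact arithmetic)
r = (1.0, 0.3)    # per-step reward of actions 0 and 1
c = (1.0, 0.0)    # per-step cost of actions 0 and 1
c_max = 1.0       # bound on expected discounted cost

def value(per_step, p):
    # Discounted value of the stationary policy p in a single state:
    # expected per-step payoff divided by (1 - beta).
    return (p * per_step[0] + (1.0 - p) * per_step[1]) / (1.0 - beta)

grid = [i / 1000.0 for i in range(1001)]

# Primal: best feasible randomized policy (here: largest p with V_c(p) <= c_max).
primal = max(value(r, p) for p in grid if value(c, p) <= c_max)

def dual_fn(lam):
    # Inner maximization of the Lagrangian; it is linear in p,
    # so an extreme point p in {0, 1} attains the maximum.
    return max(value(r, p) - lam * (value(c, p) - c_max) for p in (0.0, 1.0))

# Outer minimization over multipliers lam >= 0 by grid search.
dual = min(dual_fn(i / 100.0) for i in range(201))

print(round(primal, 2), round(dual, 2))  # prints "1.3 1.3" -- no duality gap
```

The equality of the primal and dual values is the saddle-point property in this linear special case; the paper's contribution is establishing an analogous theorem for general (nonlinear) utility functions of the total discounted reward.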
