Risk Aversion in Markov Decision Processes via Near Optimal Chernoff Bounds

The expected return is a widely used objective in decision making under uncertainty. Many algorithms, such as value iteration, have been proposed to optimize it. In risk-aware settings, however, the expected return is often not an appropriate objective to optimize. We propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties. We also draw connections to previously proposed objectives for risk-aware planning: minmax, exponential utility, percentile and mean minus variance. Our method applies to an extended class of Markov decision processes: we allow costs to be stochastic as long as they are bounded. Additionally, we present an efficient algorithm for optimizing the proposed objective. Synthetic and real-world experiments illustrate the effectiveness of our method at scale.
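As background for the expected-return baseline the abstract refers to, the sketch below runs standard value iteration on a small cost-minimizing MDP. The toy MDP itself (number of states and actions, transition probabilities, costs, discount factor, tolerance) is assumed purely for illustration; this is generic background and not the paper's Chernoff-bound-based risk-aware objective or algorithm.

```python
import numpy as np

# Minimal sketch of value iteration for the expected-cost objective.
# All numbers below are hypothetical; they do not come from the paper.

n_states, n_actions = 2, 2
gamma = 0.9  # assumed discount factor

# P[a, s, s'] = transition probability; C[a, s] = expected immediate cost.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.6, 0.4]]])
C = np.array([[1.0, 2.0],
              [0.5, 3.0]])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: minimize expected discounted cost.
    Q = C + gamma * P @ V          # Q[a, s]
    V_new = Q.min(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

policy = (C + gamma * P @ V).argmin(axis=0)  # greedy action per state
print("V* =", V, "policy =", policy)
```

The sketch optimizes only the expectation of the return; the paper's contribution is precisely to replace this criterion with a risk-aware one when the spread of the return matters.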
