Shortest Stochastic Path with Risk Sensitive Evaluation

When decisions must be made under uncertainty, how should risk be taken into account? The shortest stochastic path (SSP) problem models the task of reaching a goal at the least cost. Under uncertainty, however, the best decision may be the one that minimizes expected cost, minimizes variance, minimizes the worst case, maximizes the best case, and so on. Markov Decision Processes (MDPs) define the optimal decision in the SSP problem as the one that minimizes expected cost, so MDPs do not take risk into account. Risk Sensitive MDPs (RSMDPs) extend MDPs to account for risk, integrating expected cost, variance, worst case, and best case in a simple way, yet they have received little attention in the Artificial Intelligence literature. We show theoretically the differences and similarities between MDPs and RSMDPs for modeling the SSP problem, in particular the relationship between the discount factor γ and risk-prone attitudes in the SSP with constant cost. We also illustrate each model in a simple artificial scenario.
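To make the γ–risk relationship mentioned above concrete, here is a sketch of the standard argument, assuming unit cost per step and the exponential-utility criterion of Howard and Matheson; the paper's own formulation may differ in detail. With constant cost, the total cost of a run equals its length T, so for any policy π,

\[
\min_\pi \, \mathbb{E}_\pi\!\left[\sum_{t=0}^{T-1}\gamma^t\right]
= \min_\pi \, \frac{1-\mathbb{E}_\pi\!\left[\gamma^T\right]}{1-\gamma}
\;\Longleftrightarrow\;
\max_\pi \, \mathbb{E}_\pi\!\left[e^{\lambda T}\right],
\qquad \lambda = \ln\gamma .
\]

Since λ = ln γ < 0 for γ ∈ (0, 1), minimizing discounted expected cost under constant cost behaves like the exponential-utility criterion with a negative, i.e. risk-prone, risk factor.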
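As an illustration of the kind of simple artificial scenario the abstract mentions, the Python sketch below contrasts the two evaluations on a toy SSP with one safe and one risky action. It is a minimal sketch under assumed numbers (the state space, costs, and probabilities are invented for illustration, not taken from the paper), using a Howard–Matheson-style multiplicative Bellman update for the risk-sensitive case:

import math

# Toy SSP with one decision state "s" and an absorbing goal "g".
# "safe" : reach g for a certain cost of 5.
# "risky": pay 1 per step; reach g with prob 0.5, otherwise stay in s.
# The risky action's expected total cost is 2 (geometric), but its variance
# is positive and its worst case is unbounded, so risk attitude matters.
COSTS = {"safe": 5.0, "risky": 1.0}
TRANS = {"safe": {"g": 1.0}, "risky": {"g": 0.5, "s": 0.5}}  # P(s' | s, a)


def expected_cost_vi(iters=2000):
    """Risk-neutral value iteration: V(s) = min_a c(a) + sum_s' P(s'|s,a) V(s')."""
    v = {"s": 0.0, "g": 0.0}
    for _ in range(iters):
        q = {a: COSTS[a] + sum(p * v[s2] for s2, p in TRANS[a].items())
             for a in COSTS}
        best = min(q, key=q.get)
        v["s"] = q[best]
    return v["s"], best


def risk_sensitive_vi(lam, iters=2000):
    """Exponential-utility value iteration, Howard-Matheson style:
    W(s) = opt_a e^{lam*c(a)} * sum_s' P(s'|s,a) W(s'), with W(g) = 1.
    q[a] is the certainty equivalent of action a, which handles both signs
    of lam uniformly (lam > 0 risk-averse, lam < 0 risk-prone).  Note: for
    lam >= ln 2 the geometric "risky" loop has unbounded exponential
    utility, and the safe action becomes trivially optimal."""
    w = {"s": 1.0, "g": 1.0}
    for _ in range(iters):
        q = {a: COSTS[a] + math.log(sum(p * w[s2] for s2, p in TRANS[a].items())) / lam
             for a in COSTS}
        best = min(q, key=q.get)
        w["s"] = math.exp(lam * q[best])
    return q[best], best


if __name__ == "__main__":
    print("risk-neutral         :", expected_cost_vi())
    # lam = -0.5 corresponds to discounting with gamma = e^{-0.5} ~ 0.61
    # under constant cost, per the derivation above.
    print("risk-prone, lam=-0.5 :", risk_sensitive_vi(-0.5))
    print("risk-averse, lam=0.66:", risk_sensitive_vi(0.66))

Running this, the risk-neutral and risk-prone evaluations both choose the risky action (expected cost 2, certainty equivalent about 1.66), while a sufficiently risk-averse λ flips the choice to the safe action, whose certainty equivalent equals its deterministic cost of 5.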
