Risk-Sensitive Planning

Methods for planning in stochastic domains typically aim to find plans that minimize expected execution cost or maximize the probability of goal achievement. Researchers have largely ignored the question of how to incorporate risk-sensitive attitudes into their planning mechanisms. Since utility theory shows that it can be rational to maximize expected utility, one might believe that by replacing all costs with their respective utilities (for an appropriate utility function) one could achieve risk-sensitive attitudes without having to change the existing probabilistic planning methods. Unfortunately, we show that this is usually not the case and, moreover, that the best action in a state can depend on the total cost that the agent has already accumulated. However, we demonstrate how one can transform risk-sensitive planning problems into equivalent ones for risk-neutral agents, provided that utility functions with the delta property are used. The transformed risk-seeking planning problem can then be solved with any AI planning algorithm that either minimizes (or satisfices) expected execution cost or, equivalently, one that maximizes (or satisfices) the probability of goal achievement. Thus, one can extend the functionality of these planners to risk-sensitive planning.
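A minimal numeric sketch (not the paper's algorithm; the actions, costs, and utility functions below are illustrative assumptions) of the two claims above: under a utility function without the delta property, the best action in a state can flip depending on the cost already accumulated, whereas an exponential utility, which has the delta property U(c0 + c) = f(c0) · U(c), ranks actions independently of accumulated cost.

```python
def expected_utility(outcomes, accumulated, U):
    """Expected utility of total cost for one action.
    outcomes: list of (probability, cost) pairs for that action."""
    return sum(p * U(accumulated + c) for p, c in outcomes)

def best_action(actions, accumulated, U):
    """Action maximizing expected utility, given the cost accumulated so far."""
    return max(actions, key=lambda a: expected_utility(actions[a], accumulated, U))

# Two hypothetical actions in the same state:
actions = {
    "safe":  [(1.0, 10)],            # cost 10 for certain
    "risky": [(0.5, 0), (0.5, 19)],  # cost 0 or 19, each with probability 0.5
}

quad = lambda c: -c ** 2    # risk-averse utility WITHOUT the delta property
expo = lambda c: -1.1 ** c  # exponential utility: U(c0 + c) = 1.1**c0 * U(c)

# Without the delta property, the preferred action depends on accumulated cost:
print(best_action(actions, 0, quad), best_action(actions, 100, quad))  # safe risky
# With the delta property, the ranking is invariant to accumulated cost:
print(best_action(actions, 0, expo), best_action(actions, 100, expo))  # safe safe
```

The invariance under the exponential utility is what allows accumulated cost to be dropped from the state, so a transformed problem can be handed to an ordinary risk-neutral planner.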