Computing rank dependent utility in graphical models for sequential decision problems

This paper is devoted to automated sequential decision making in AI. More precisely, we focus on the Rank Dependent Utility (RDU) model, which can accommodate rational decision behaviors that the Expected Utility model cannot. However, the non-linearity of RDU makes it difficult to compute an RDU-optimal strategy in sequential decision problems, and this has considerably slowed the use of RDU in operational contexts. We provide new algorithmic solutions for computing an RDU-optimal strategy in graphical models. Specifically, we present algorithms for solving decision tree models and influence diagram models of sequential decision problems. For decision tree models, we propose a mixed integer programming formulation that is valid for a subclass of RDU models (corresponding to risk seeking behaviors). This formulation reduces to a linear program when mixed strategies are considered. In the general case (i.e., when no particular assumption is made on the parameters of RDU), we propose a branch and bound procedure to compute an RDU-optimal strategy among the pure ones. After highlighting the difficulties induced by the use of RDU in influence diagram models, we show how this procedure can be extended to optimize RDU in an influence diagram. Finally, we provide empirical evaluations of all the presented algorithms.
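To make the non-linearity mentioned above concrete, the following minimal sketch evaluates a single lottery under the standard rank dependent form: outcomes are sorted in increasing order, and each utility increment is weighted by a transformed probability of receiving at least that outcome. The function name `rdu`, the utility `u`, the weighting function `phi`, and the example lottery are all illustrative assumptions, not taken from the paper, and the sketch does not implement any of the paper's algorithms.

```python
from typing import Callable, List, Tuple

Lottery = List[Tuple[float, float]]  # (outcome, probability) pairs


def rdu(lottery: Lottery,
        u: Callable[[float], float],
        phi: Callable[[float], float]) -> float:
    """Rank dependent utility of a finite lottery (illustrative sketch).

    Assumes phi is non-decreasing with phi(0) = 0 and phi(1) = 1,
    and that the probabilities sum to 1.
    """
    if abs(sum(p for _, p in lottery) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")

    # Rank outcomes from worst to best.
    ranked = sorted(lottery, key=lambda xp: xp[0])

    # The worst outcome is obtained for sure.
    value = u(ranked[0][0])
    tail = 1.0  # probability of getting at least the current outcome

    for i in range(1, len(ranked)):
        tail -= ranked[i - 1][1]
        # Utility increment, weighted by the transformed tail probability.
        value += (u(ranked[i][0]) - u(ranked[i - 1][0])) * phi(tail)
    return value


# Hypothetical example: a fair coin flip between 0 and 100.
lottery = [(0.0, 0.5), (100.0, 0.5)]
identity = lambda x: x

print(rdu(lottery, identity, identity))          # 50.0 (expected utility)
print(rdu(lottery, identity, lambda p: p * p))   # 25.0 (convex phi)
```

With the identity weighting the value coincides with expected utility. Under this tail-probability convention, a convex `phi` underweights the chance of good outcomes and a concave `phi` overweights it. Because the weight attached to an outcome depends on its rank within the whole lottery, the value of a compound lottery is in general not a weighted combination of the values of its sub-lotteries.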
