Markov Decision Processes and Non-Classical Preferences

The standard model of Markov decision processes implicitly relies on a preference structure induced by scalar, additive costs and by the criterion chosen for policy evaluation (total, discounted, average, etc.). This preference structure rests on strong assumptions that permit the use of dynamic programming. We are interested here in Markov decision processes whose preference structure is non-classical, and we give simple sufficient conditions on these preferences under which methods based on dynamic programming remain applicable. These conditions thus define a larger class of Markov decision processes that can be solved with dynamic programming techniques.
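As a point of reference for the classical case described above, the following is a minimal sketch of finite-horizon backward induction with scalar, additive costs and the expected total cost criterion. It is not taken from the paper: the toy model `TOY_MDP`, the function name `backward_induction`, and the data layout are all assumptions made for illustration. The sum, the expectation, and the `<` comparison are the places where the classical preference structure enters; a non-classical structure would replace those operations.

```python
from typing import Dict, List, Tuple

# Hypothetical toy model (not from the paper): two states, two actions.
# TOY_MDP[state][action] is a list of (probability, next_state, cost) triples.
Outcome = Tuple[float, int, float]
TOY_MDP: Dict[int, Dict[str, List[Outcome]]] = {
    0: {"a": [(1.0, 0, 2.0)],
        "b": [(0.5, 0, 1.0), (0.5, 1, 2.0)]},
    1: {"a": [(1.0, 1, 1.0)],
        "b": [(1.0, 0, 3.0)]},
}


def backward_induction(mdp, horizon):
    """Minimise expected total cost over a finite horizon.

    Returns the optimal cost-to-go at the first stage and one decision
    rule per stage (policy[0] is the rule used at the first stage).
    """
    value = {s: 0.0 for s in mdp}  # terminal cost assumed to be zero
    policy = []
    for _ in range(horizon):
        new_value, decision = {}, {}
        for s, actions in mdp.items():
            best_action, best_cost = None, float("inf")
            for a, outcomes in actions.items():
                # Additive scalar costs combined by expectation.
                q = sum(p * (c + value[s2]) for p, s2, c in outcomes)
                # Costs are compared with the usual total order on reals.
                if q < best_cost:
                    best_action, best_cost = a, q
            new_value[s], decision[s] = best_cost, best_action
        value = new_value
        policy.insert(0, decision)  # stages are computed last to first
    return value, policy


value, policy = backward_induction(TOY_MDP, horizon=2)
print(value)      # optimal expected total cost per starting state
print(policy[0])  # decision rule at the first stage
```

The recursion is valid here because the expected additive cost is monotone in the cost-to-go of successor states, which is the kind of property the abstract says must be preserved when the preference structure changes.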
