Adaptive Elicitation of Preferences under Uncertainty in Sequential Decision Making Problems

This paper introduces an adaptive preference elicitation method for interactive decision support in sequential decision problems. The Decision Maker's preferences are assumed to be representable by an additive utility function that is initially unknown or only imperfectly known. We first study how to determine possibly optimal policies when the set of admissible utilities is defined imprecisely by linear constraints derived from observed preferences. We then introduce a new approach that interleaves utility elicitation with backward induction to incrementally determine a near-optimal policy. We propose an interactive algorithm with performance guarantees and report numerical tests demonstrating the practical efficiency of the approach.
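To make the idea of interleaving elicitation with backward induction concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes outcomes are attribute vectors, that utility is a weighted sum of attributes, and that the admissible utilities are a small finite list of candidate weight vectors (a stand-in for the set defined by linear constraints). The tree encoding and the names `solve`, `ask`, and `candidates` are hypothetical. A query is issued only when the remaining candidates disagree on which of two sub-policies is better.

```python
# Hypothetical tree encoding (an assumption for this sketch):
#   ("leaf", vector)
#   ("chance", [(probability, child), ...])
#   ("decision", [child, ...])

def dot(w, x):
    """Weighted sum of attributes: the additive utility of vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def solve(node, candidates, ask):
    """Backward induction with on-demand preference queries.

    Returns (expected attribute vector of the chosen policy,
             candidate weights still consistent with the answers,
             number of queries asked).
    """
    kind = node[0]
    if kind == "leaf":
        return node[1], candidates, 0
    if kind == "chance":
        total, queries = None, 0
        for prob, child in node[1]:
            vec, candidates, q = solve(child, candidates, ask)
            queries += q
            scaled = tuple(prob * v for v in vec)
            total = scaled if total is None else tuple(
                a + b for a, b in zip(total, scaled))
        return total, candidates, queries
    # Decision node: keep the best child seen so far.
    best, queries = None, 0
    for child in node[1]:
        vec, candidates, q = solve(child, candidates, ask)
        queries += q
        if best is None:
            best = vec
            continue
        prefers_new = [dot(w, vec) > dot(w, best) for w in candidates]
        if all(prefers_new):
            best = vec                      # every candidate agrees: no query
        elif any(prefers_new):
            queries += 1                    # candidates disagree: ask the DM
            answer = ask(vec, best)
            candidates = [w for w, p in zip(candidates, prefers_new)
                          if p == answer]   # drop inconsistent candidates
            if answer:
                best = vec
    return best, candidates, queries

# Toy instance: two attributes, five candidate weight vectors.
candidates = [(1.0, 0.0), (0.75, 0.25), (0.5, 0.5), (0.25, 0.75), (0.0, 1.0)]
true_w = (0.25, 0.75)  # hidden from the solver; used only to simulate answers

def ask(a, b):
    """Simulated Decision Maker: True if a is strictly preferred to b."""
    return dot(true_w, a) > dot(true_w, b)

tree = ("decision", [
    ("chance", [(0.5, ("leaf", (10, 0))), (0.5, ("leaf", (0, 4)))]),
    ("decision", [("leaf", (4, 4)), ("leaf", (2, 6))]),
])

vec, remaining, n_queries = solve(tree, candidates, ask)
print(vec, remaining, n_queries)
```

In this toy instance a single query at the inner decision node is enough: the answer removes the candidates that favour the first attribute, and the remaining ones all agree at the root. A finite candidate list keeps the sketch dependency-free; handling a set of weights defined by linear constraints would require linear programming instead.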
