Learning Lexicographic Preference Models

Lexicographic preference models (LPMs) are one of the simplest yet most commonly used preference representations. In this chapter, we formally define LPMs and present algorithms for learning these models from data. In particular, we study a greedy algorithm that produces a “best guess” LPM consistent with the observations, and two voting-based algorithms that approximate the target model using the votes of a collection of consistent LPMs. In addition to theoretical analyses of these algorithms, we empirically evaluate their performance under different conditions. Our results show that the voting algorithms outperform the greedy method when the data is noise-free, and that this advantage is more pronounced when training data is scarce. However, the performance of the voting algorithms degrades quickly with even a small amount of noise, whereas the greedy algorithm is more robust. Inspired by this result, we adapt one of the voting methods to account for the amount of noise in an environment and empirically show that the modified voting algorithm performs as well as the greedy approach even with noisy observations. We also introduce an intuitive yet powerful learning bias that prunes some of the candidate LPMs. We demonstrate how this bias can be combined with variable and model voting and show that it improves learning performance significantly, especially when the number of observations is small.
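
As a rough illustration of the greedy approach described above, the sketch below learns an attribute order (with preferred values) over binary attributes from pairwise comparisons. The names (learn_greedy_lpm, Observation) are illustrative placeholders rather than the chapter's actual implementation, and the sketch assumes noise-free data in which an LPM consistent with all observations exists.

```python
from typing import List, Tuple

# An observation is a pair (preferred, other); each object is a tuple of
# binary attribute values.
Observation = Tuple[Tuple[int, ...], Tuple[int, ...]]

def learn_greedy_lpm(observations: List[Observation], n_attrs: int) -> List[Tuple[int, int]]:
    """Greedily build an attribute order: at each step, pick the (attribute,
    preferred value) pair that correctly decides the most remaining pairs
    without contradicting any of them, then discard the pairs it decides."""
    remaining = list(observations)
    order: List[Tuple[int, int]] = []
    unused = set(range(n_attrs))
    while remaining and unused:
        best = None  # (pairs decided correctly, attribute, preferred value)
        for a in unused:
            for v in (0, 1):
                correct = sum(1 for p, q in remaining if p[a] == v and q[a] != v)
                wrong = sum(1 for p, q in remaining if q[a] == v and p[a] != v)
                if wrong == 0 and (best is None or correct > best[0]):
                    best = (correct, a, v)
        if best is None or best[0] == 0:
            break  # no attribute decides a remaining pair without contradiction
        _, a, v = best
        order.append((a, v))
        unused.discard(a)
        # keep only the pairs that this attribute leaves undecided
        remaining = [(p, q) for p, q in remaining if p[a] == q[a]]
    return order

# Example: with these noise-free comparisons, the learned order ranks
# attribute 0 (preferring value 1) above attribute 1 (preferring value 0).
obs = [((1, 0), (0, 0)), ((1, 0), (1, 1)), ((1, 0), (0, 1))]
print(learn_greedy_lpm(obs, n_attrs=2))  # [(0, 1), (1, 0)]
```

The voting-based algorithms discussed above differ in that they do not commit to a single greedy choice; instead, they aggregate the votes of the collection of LPMs consistent with the observations.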
