Efficient Bayesian Inference for Generalized Bradley–Terry Models

The Bradley–Terry model is a popular approach for describing the probabilities of the possible outcomes when elements of a set are repeatedly compared with one another in pairs. It has been applied in areas including animal behavior, chess ranking, and multiclass classification. Numerous extensions of the basic model have also been proposed in the literature, including models with ties, multiple comparisons, group comparisons, and random graphs. From a computational point of view, Hunter has proposed efficient iterative minorization-maximization (MM) algorithms for maximum likelihood estimation in these generalized Bradley–Terry models, whereas Bayesian inference is typically performed using Markov chain Monte Carlo algorithms based on tailored Metropolis–Hastings proposals. We show here that these MM algorithms can be reinterpreted as special instances of expectation-maximization (EM) algorithms associated with suitable sets of latent variables, and we propose some original extensions. These latent variables allow us to derive simple Gibbs samplers for Bayesian inference. We demonstrate experimentally the efficiency of these algorithms on a variety of applications.
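To make the connection between the MM/EM updates and the Gibbs sampler concrete, here is a minimal Python sketch for the basic paired-comparison model only, in which item i beats item j with probability λ_i / (λ_i + λ_j). It rests on assumptions that the abstract does not state: independent Gamma(a, b) priors on the skill parameters λ_i, and one latent variable Z_ij ~ Gamma(n_ij, λ_i + λ_j) per compared pair, where n_ij is the number of comparisons between i and j. The function names `mm_bradley_terry` and `gibbs_bradley_terry`, the `wins` matrix layout, and the default hyperparameters are illustrative choices, not taken from the paper. With a = 1 and b = 0 the first function should reduce to the classical MM update for maximum likelihood.

```python
import numpy as np


def mm_bradley_terry(wins, n_iter=200, a=1.0, b=0.0):
    """MM/EM-style fixed-point iteration for the skill parameters.

    wins[i, j] is the number of times item i beat item j (zero diagonal).
    Assumed prior: lambda_i ~ Gamma(shape=a, rate=b); a=1, b=0 gives the
    maximum likelihood update.
    """
    wins = np.asarray(wins, dtype=float)
    n = wins + wins.T          # n[i, j]: number of comparisons between i and j
    w = wins.sum(axis=1)       # w[i]: total number of wins of item i
    lam = np.ones(wins.shape[0])
    for _ in range(n_iter):
        pair_sums = lam[:, None] + lam[None, :]
        denom = (n / pair_sums).sum(axis=1)  # diagonal of n is 0, so no self term
        lam = (a - 1.0 + w) / (b + denom)
        if b == 0.0:
            lam /= lam.sum()   # likelihood is scale invariant, so fix the scale
    return lam


def gibbs_bradley_terry(wins, n_samples=2000, a=2.0, b=1.0, seed=0):
    """Two-block Gibbs sampler under the assumed latent-variable construction.

    Alternates Z_ij | lambda ~ Gamma(n_ij, rate=lambda_i + lambda_j) and
    lambda_i | Z ~ Gamma(a + w_i, rate=b + sum_j Z_ij).
    """
    rng = np.random.default_rng(seed)
    wins = np.asarray(wins, dtype=float)
    n = wins + wins.T
    w = wins.sum(axis=1)
    K = wins.shape[0]
    iu = np.triu_indices(K, k=1)   # index each unordered pair once
    lam = np.ones(K)
    samples = np.empty((n_samples, K))
    for t in range(n_samples):
        shape = n[iu]
        rate = lam[iu[0]] + lam[iu[1]]
        active = shape > 0         # pairs that were never compared get Z = 0
        z = np.zeros_like(shape)
        z[active] = rng.gamma(shape[active], 1.0 / rate[active])
        Z = np.zeros((K, K))
        Z[iu] = z
        Z = Z + Z.T                # symmetric: Z[i, j] == Z[j, i]
        # NumPy's gamma takes (shape, scale), so pass scale = 1 / rate.
        lam = rng.gamma(a + w, 1.0 / (b + Z.sum(axis=1)))
        samples[t] = lam
    return samples


if __name__ == "__main__":
    # Hypothetical data: three items, ten comparisons per pair.
    wins = np.array([[0, 7, 9],
                     [3, 0, 6],
                     [1, 4, 0]])
    print("MM estimate:", mm_bradley_terry(wins))
    print("Posterior mean:", gibbs_bradley_terry(wins).mean(axis=0))
```

Both conditional distributions in the sampler are Gamma, so no Metropolis–Hastings proposal needs tuning. The MM iteration with b = 0 is only expected to converge when the comparison graph is sufficiently connected (for example, every item has both won and lost at least once along a chain linking all items).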
