Modelling preference data with the Wallenius distribution

The Wallenius distribution is a generalisation of the hypergeometric distribution in which weights are assigned to balls of different colours. This naturally defines a model for ranking categories that can be used for classification purposes. Since the resulting likelihood is, in general, not analytically available, we adopt an approximate Bayesian computational (ABC) approach to estimate the importance of the categories. We illustrate the performance of the estimation procedure on simulated datasets. Finally, we use the new model to analyse two datasets: one on movie ratings and one on the journal preferences of Italian academic statisticians. The latter is a novel dataset collected by the authors.
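The two ingredients named above, a weighted urn and ABC, can be illustrated with a minimal sketch. It is not the authors' estimation procedure. It assumes the standard sequential description of the Wallenius urn (each draw picks colour i with probability proportional to its weight times the number of balls of that colour still in the urn) and uses plain rejection ABC. The function names, the uniform Dirichlet prior on normalised weights, the L1 distance on colour counts, and the tolerance `eps` are all illustrative assumptions.

```python
import random


def wallenius_draw(counts, weights, n, rng):
    """Draw n balls one at a time without replacement from a weighted urn.

    Colour i is chosen with probability proportional to
    weights[i] * (balls of colour i still in the urn).
    """
    if n > sum(counts):
        raise ValueError("cannot draw more balls than the urn contains")
    remaining = list(counts)
    drawn = [0] * len(counts)
    for _ in range(n):
        masses = [w * r for w, r in zip(weights, remaining)]
        i = rng.choices(range(len(counts)), weights=masses)[0]
        remaining[i] -= 1
        drawn[i] += 1
    return drawn


def abc_rejection(observed, counts, n, n_sims, eps, rng):
    """Rejection ABC: keep prior draws whose simulated counts are close to the data.

    Assumed (hypothetical) choices: Dirichlet(1, ..., 1) prior on weights
    normalised to sum to one, and L1 distance between count vectors.
    """
    accepted = []
    for _ in range(n_sims):
        # Normalised Gamma(1, 1) variates give a Dirichlet(1, ..., 1) draw.
        g = [rng.gammavariate(1.0, 1.0) for _ in counts]
        total = sum(g)
        w = [x / total for x in g]
        simulated = wallenius_draw(counts, w, n, rng)
        distance = sum(abs(a - b) for a, b in zip(simulated, observed))
        if distance <= eps:
            accepted.append(w)
    return accepted


# Toy usage: three colours, 20 balls each, 15 balls drawn.
rng = random.Random(0)
counts = [20, 20, 20]
true_weights = [0.6, 0.3, 0.1]
observed = wallenius_draw(counts, true_weights, 15, rng)
posterior = abc_rejection(observed, counts, 15, 2000, 2, rng)
```

The accepted weight vectors in `posterior` form an approximate posterior sample. Averaging them componentwise gives a rough estimate of each category's importance, and shrinking `eps` trades acceptance rate for accuracy.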
