A novel method for estimating the common signals for consensus across multiple ranked lists

Ranking objects, such as journals, institutions or biological entities, is widely used to assess their relative quality or relevance. Typically, several assessors (humans or machines) each rank the same set of objects, and inference about the nature of the observed rankings is desirable for evaluation, business or scientific purposes. The assessors' decisions are based on some inherent metric scale and depend on their judgement and discriminatory ability, information to which we usually do not have access. We propose an indirect inference approach that estimates the signal parameters that might be causal for the rankings observed from several assessors, some of which may not provide the same decision quality. The order of the estimated signal values represents a consensus ranking across the observed individual rankings. The standard errors of the estimated signal parameters are obtained through a non-parametric bootstrap, so the signal variability can be evaluated object-wise to quantify the stability of the associated rank positions. As a result, such signal estimates can be used in the meta-analysis of conceptually similar evaluation exercises, studies or experiments, and in any data integration task where measurements on the metric scale are either unavailable or not directly comparable. The suggested approach is validated on simulated rank data as well as on experimental rank data from current molecular medicine. The proposed algorithms were implemented, and all calculations performed, in the R environment. The source code is provided.
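To illustrate the general idea of object-wise bootstrap standard errors for rank-derived signals, the following is a minimal sketch in Python (standard library only). It is not the algorithm proposed here, and it is not the provided R code: the signal estimator is a deliberately simple stand-in (the negated mean rank per object), and the function names `signal_estimates`, `consensus_order` and `bootstrap_se`, the toy data, and the choice to resample assessors with replacement are all assumptions made for illustration.

```python
import random
import statistics


def signal_estimates(rankings):
    """Stand-in signal estimator (an assumption, not the paper's method):
    the negated mean rank of each object, so a higher value means 'better'."""
    n_objects = len(rankings[0])
    return [-statistics.mean(r[i] for r in rankings) for i in range(n_objects)]


def consensus_order(signals):
    """Object indices sorted by decreasing estimated signal."""
    return sorted(range(len(signals)), key=lambda i: -signals[i])


def bootstrap_se(rankings, n_boot=500, seed=0):
    """Non-parametric bootstrap over assessors: resample whole rankings with
    replacement, re-estimate the signals, and return the standard deviation
    of the re-estimates for each object."""
    rng = random.Random(seed)
    n_assessors = len(rankings)
    n_objects = len(rankings[0])
    draws = [[] for _ in range(n_objects)]
    for _ in range(n_boot):
        sample = [rankings[rng.randrange(n_assessors)] for _ in range(n_assessors)]
        for i, value in enumerate(signal_estimates(sample)):
            draws[i].append(value)
    return [statistics.stdev(d) for d in draws]


# Toy data (hypothetical): rankings[a][i] is the rank position that
# assessor a assigns to object i, with 1 being the best position.
rankings = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
    [1, 2, 4, 3],
    [1, 2, 3, 4],
]

signals = signal_estimates(rankings)
order = consensus_order(signals)
se = bootstrap_se(rankings)

for i in order:
    print(f"object {i}: signal = {signals[i]:.2f}, bootstrap SE = {se[i]:.3f}")
```

In this sketch, a large bootstrap standard error relative to the gap between neighbouring signal values would suggest that the corresponding rank position is unstable. A model-based estimator could be substituted for `signal_estimates` without changing the resampling loop.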
