Adaptive Sampling for Coarse Ranking

We consider the problem of active coarse ranking, where the goal is to sort items by their means into clusters of pre-specified sizes by adaptively sampling from their reward distributions. This setting arises in many social-science applications involving human raters, where an approximate rank of every item is desired. Approximate or coarse ranking can significantly reduce the number of ratings required compared to the number needed to find an exact ranking. We propose a computationally efficient PAC algorithm, LUCBRank, for coarse ranking, and derive an upper bound on its sample complexity. We also derive a nearly matching distribution-dependent lower bound. Experiments on synthetic as well as real-world data show that LUCBRank performs better than state-of-the-art baseline methods, even when these methods have the advantage of knowing the underlying parametric model.
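LUCBRank extends LUCB-style confidence-bound sampling from best-arm identification to cluster boundaries. The abstract does not spell out the procedure, so the following is a minimal sketch of one plausible LUCB-style routine for coarse ranking, not the paper's specification: it assumes 1-sub-Gaussian rewards and a Hoeffding-style anytime confidence radius, and the names (`lucb_rank`, `sample`), the radius constant, and the stopping rule are our illustrative assumptions.

```python
# Illustrative sketch of an LUCB-style coarse-ranking routine (assumptions ours,
# not from the paper): 1-sub-Gaussian rewards, Hoeffding-style confidence radius.

import math
import numpy as np

def lucb_rank(sample, n_items, cluster_sizes, delta=0.05, max_pulls=200_000):
    """Adaptively sample items and return clusters of the given sizes,
    ordered from highest to lowest estimated mean.

    sample(i) -> one noisy reward for item i (assumed 1-sub-Gaussian).
    """
    assert sum(cluster_sizes) == n_items
    boundaries = np.cumsum(cluster_sizes)[:-1]    # split points between clusters

    sums = np.zeros(n_items)
    counts = np.zeros(n_items, dtype=int)
    for i in range(n_items):                      # initialize: one pull per item
        sums[i] += sample(i)
        counts[i] += 1
    total = n_items

    def radius(t):
        # Hoeffding-style anytime confidence radius (one plausible choice).
        return math.sqrt(math.log(4.0 * n_items * t * t / delta) / (2.0 * t))

    while total < max_pulls:
        means = sums / counts
        rad = np.array([radius(t) for t in counts])
        order = np.argsort(-means)                # items by empirical mean, descending

        # For each cluster boundary, find the most contentious pair: the weakest
        # item (by LCB) above the split vs. the strongest (by UCB) below it.
        contentious = set()
        for b in boundaries:
            top, bottom = order[:b], order[b:]
            weak = top[np.argmin(means[top] - rad[top])]
            strong = bottom[np.argmax(means[bottom] + rad[bottom])]
            if means[strong] + rad[strong] > means[weak] - rad[weak]:
                contentious.update((weak, strong))  # boundary not yet resolved

        if not contentious:                       # every split separated: stop
            break
        for i in contentious:                     # pull only the contentious items
            sums[i] += sample(i)
            counts[i] += 1
            total += 1

    order = np.argsort(-(sums / counts))
    clusters, start = [], 0
    for k in cluster_sizes:
        clusters.append(sorted(order[start:start + k].tolist()))
        start += k
    return clusters, total
```

A small usage example under the same assumptions, with a hypothetical simulated rater:

```python
rng = np.random.default_rng(0)
true_means = np.linspace(1.0, 0.0, 10)
clusters, pulls = lucb_rank(lambda i: true_means[i] + rng.normal(),
                            n_items=10, cluster_sizes=[3, 4, 3])
```

The design point the sketch tries to capture is that each round pulls only the most contentious pair at every unresolved boundary, so samples concentrate near the cluster splits rather than being spent on resolving the exact order within clusters.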
