Towards Optimal Estimation of Bivariate Isotonic Matrices with Unknown Permutations

Many applications, including rank aggregation, crowd-labeling, and graphon estimation, can be modeled in terms of a bivariate isotonic matrix with unknown permutations acting on its rows and/or columns. We consider the problem of estimating an unknown matrix in this class, based on noisy observations of its entries or of a subset of them. We design and analyze polynomial-time algorithms that improve upon the state of the art in two distinct metrics, showing, in particular, that minimax optimal, computationally efficient estimation is achievable in certain settings. Along the way, we prove matching upper and lower bounds on the minimax radii of certain cone testing problems, which may be of independent interest.
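To make the observation model concrete, the following is a minimal simulation sketch, not the estimation algorithm of the paper. It assumes that "bivariate isotonic" means entries in [0, 1] that are nondecreasing along every row and every column, that the noise is additive Gaussian, and that each entry is observed independently with a fixed probability. The function names (`bivariate_isotonic`, `observe`) and the parameters (`p_obs`, `noise_sd`) are hypothetical, introduced only for illustration.

```python
import random


def bivariate_isotonic(n_rows, n_cols, rng):
    """Return a matrix in [0, 1] that is nondecreasing along rows and columns.

    Assumption: the matrix is built as a 2-D cumulative sum of nonnegative
    increments, which is one simple way to obtain such a matrix.
    """
    inc = [[rng.random() for _ in range(n_cols)] for _ in range(n_rows)]
    # Cumulative sums along each row: nondecreasing in the column index.
    for row in inc:
        for j in range(1, n_cols):
            row[j] += row[j - 1]
    # Cumulative sums down each column: nondecreasing in the row index.
    for i in range(1, n_rows):
        for j in range(n_cols):
            inc[i][j] += inc[i - 1][j]
    total = inc[-1][-1]  # largest entry; rescale so all entries lie in [0, 1]
    return [[v / total for v in row] for row in inc]


def observe(m, pi, sigma, p_obs, noise_sd, rng):
    """Permute rows by pi and columns by sigma, then observe a noisy subset.

    Entry (i, j) of the result is m[pi[i]][sigma[j]] plus Gaussian noise with
    probability p_obs, and None (unobserved) otherwise.
    """
    n_rows, n_cols = len(m), len(m[0])
    y = [[None] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            if rng.random() < p_obs:
                y[i][j] = m[pi[i]][sigma[j]] + rng.gauss(0.0, noise_sd)
    return y


if __name__ == "__main__":
    rng = random.Random(0)
    m = bivariate_isotonic(4, 5, rng)
    pi = list(range(4))
    sigma = list(range(5))
    rng.shuffle(pi)      # unknown row permutation
    rng.shuffle(sigma)   # unknown column permutation
    y = observe(m, pi, sigma, p_obs=0.7, noise_sd=0.1, rng=rng)
    for row in y:
        print(["  .  " if v is None else f"{v:5.2f}" for v in row])
```

The estimation problem is then to recover the permuted matrix from `y` alone, without access to `m`, `pi`, or `sigma`.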


[38]  Martin J. Wainwright,et al.  Restricted strong convexity and weighted matrix completion: Optimal bounds with noise , 2010, J. Mach. Learn. Res..

[39]  Arun Rajkumar,et al.  A Statistical Convergence Perspective of Algorithms for Rank Aggregation from Pairwise Data , 2014, ICML.

[40]  L. Thurstone A law of comparative judgment. , 1994 .

[41]  Alexandre d'Aspremont,et al.  Convex Relaxations for Permutation Problems , 2013, SIAM J. Matrix Anal. Appl..

[42]  Mark Braverman,et al.  Noisy sorting without resampling , 2007, SODA '08.

[43]  Martin J. Wainwright,et al.  Worst-case vs Average-case Design for Estimation from Fixed Pairwise Comparisons , 2017, ArXiv.

[44]  E. Rio,et al.  Concentration around the mean for maxima of empirical processes , 2005, math/0506594.

[45]  Devavrat Shah,et al.  Rank Centrality: Ranking from Pairwise Comparisons , 2012, Oper. Res..

[46]  R. Luce,et al.  Stochastic transitivity and cancellation of preferences between bitter-sweet solutions , 1965 .

[47]  Martin J. Wainwright,et al.  Feeling the Bern: Adaptive Estimators for Bernoulli Probabilities of Pairwise Comparisons , 2019, IEEE Transactions on Information Theory.

[48]  Devavrat Shah,et al.  Learning Mixture Model with Missing Values and its Application to Rankings , 2018, ArXiv.

[49]  Martin J. Wainwright,et al.  Simple, Robust and Optimal Ranking from Pairwise Comparisons , 2015, J. Mach. Learn. Res..

[50]  J. Wellner,et al.  Entropy estimate for high-dimensional monotonic functions , 2005, math/0512641.

[51]  Gábor Lugosi,et al.  Concentration Inequalities - A Nonasymptotic Theory of Independence , 2013, Concentration Inequalities.

[52]  P. Bickel,et al.  The method of moments and degree distributions for network models , 2011, 1202.5101.

[53]  S. Chatterjee A new perspective on least squares under convex constraint , 2014, 1402.0830.

[54]  A. Tsybakov,et al.  Oracle inequalities for network models and sparse graphon estimation , 2015, 1507.04118.

[55]  R. Dykstra,et al.  Isotonic Regression in Two Independent Variables , 1984 .

[56]  Martin J. Wainwright,et al.  Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues , 2015, IEEE Transactions on Information Theory.

[57]  Adityanand Guntuboyina,et al.  On matrix estimation under monotonicity constraints , 2015, 1506.03430.

[58]  Sewoong Oh,et al.  Learning from Comparisons and Choices , 2017, J. Mach. Learn. Res..

[59]  Edoardo M. Airoldi,et al.  A Consistent Histogram Estimator for Exchangeable Graph Models , 2014, ICML.

[60]  Anirban Dasgupta,et al.  Aggregating crowdsourced binary ratings , 2013, WWW.

[61]  S. Chatterjee,et al.  Matrix estimation by Universal Singular Value Thresholding , 2012, 1212.1247.

[62]  C. Borgs,et al.  Consistent nonparametric estimation for heavy-tailed sparse graphs , 2015, The Annals of Statistics.

[63]  Alon Orlitsky,et al.  Maximum Selection and Ranking under Noisy Comparisons , 2017, ICML.

[64]  Chao Gao,et al.  Exact Exponent in Optimal Rates for Crowdsourcing , 2016, ICML.

[65]  Anup Rao,et al.  Fast, Provable Algorithms for Isotonic Regression in all L_p-norms , 2015, NIPS.

[66]  Jin Zhang,et al.  Preference Completion: Large-scale Collaborative Ranking from Pairwise Comparisons , 2015, ICML.

[67]  Jian Peng,et al.  Variational Inference for Crowdsourcing , 2012, NIPS.

[68]  Pietro Perona,et al.  The Multidimensional Wisdom of Crowds , 2010, NIPS.

[69]  Sabyasachi Chatterjee,et al.  Isotonic regression in general dimensions , 2017, The Annals of Statistics.

[70]  Lukas Biewald,et al.  Programmatic Gold: Targeted and Scalable Quality Assurance in Crowdsourcing , 2011, Human Computation.

[71]  D. Shah,et al.  Unifying Framework for Crowd-sourcing via Graphon Estimation , 2017 .

[72]  E. Gilbert A comparison of signalling alphabets , 1952 .

[73]  R. Duncan Luce,et al.  Individual Choice Behavior: A Theoretical Analysis , 1979 .

[74]  R. A. Bradley,et al.  Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons , 1952 .

[75]  Devavrat Shah,et al.  Iterative Learning for Reliable Crowdsourcing Systems , 2011, NIPS.

[76]  Emmanuel Abbe,et al.  Community detection and stochastic block models: recent developments , 2017, Found. Trends Commun. Inf. Theory.

[77]  R. Preston McAfee,et al.  Who moderates the moderators?: crowdsourcing abuse detection in user-generated content , 2011, EC '11.