Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information

In supervised learning, we leverage a labeled dataset to design methods for function estimation. In many practical situations, we can obtain alternative feedback, possibly at low cost. A broad goal is to understand the usefulness of, and to design algorithms to exploit, this alternative feedback. We focus on a semi-supervised setting where we obtain additional ordinal (or comparison) information for potentially unlabeled samples. We consider ordinal feedback of varying quality: a perfect ordering of the samples, a noisy ordering of the samples, or noisy pairwise comparisons between samples. We precisely quantify the usefulness of these types of ordinal feedback in nonparametric regression, showing that in many cases it is possible to accurately estimate an underlying function with a very small labeled set, effectively escaping the curse of dimensionality. We develop an algorithm called Ranking-Regression (RR) and analyze its accuracy as a function of the sizes of the labeled and unlabeled datasets and of various noise parameters. We also present lower bounds that establish fundamental limits for the task and show that RR is optimal in a variety of settings. Finally, we present experiments demonstrating the efficacy of RR and investigating its robustness to various sources of noise and to model misspecification.
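The core idea can be sketched roughly as follows (a simplified, hypothetical illustration of ordinal-assisted regression, not the paper's exact procedure): rank the samples using the ordinal feedback, fit a monotone (isotonic) regression to the few observed labels along that ranking, and impute values for unlabeled samples from the nearest labeled position. The function names `pava` and `ranking_regression` below are illustrative, not from the paper.

```python
def pava(values):
    """Pool-adjacent-violators: least-squares nondecreasing fit to a sequence."""
    blocks = []  # each block is [sum, count]; block mean = sum / count
    for v in values:
        blocks.append([v, 1])
        # merge adjacent blocks while their means violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out

def ranking_regression(order, labels):
    """Sketch of ordinal-assisted estimation.

    order:  indices of all n samples, sorted by the ordinal feedback
            (ascending in the unknown function value).
    labels: dict mapping a small subset of sample indices to noisy labels.
    Returns an estimate y_hat[i] for every sample index i.
    """
    # positions in the ranking where we observed a label
    labeled_pos = [p for p, i in enumerate(order) if i in labels]
    # isotonic fit of the observed labels along the ranking
    iso = pava([labels[order[p]] for p in labeled_pos])
    y_hat = [0.0] * len(order)
    for p, i in enumerate(order):
        # impute from the nearest labeled position in the ranking
        j = min(range(len(labeled_pos)), key=lambda k: abs(labeled_pos[k] - p))
        y_hat[i] = iso[j]
    return y_hat
```

With a perfect ordering of four samples and only two labels, e.g. `ranking_regression([0, 1, 2, 3], {0: 1.0, 3: 2.0})`, every unlabeled sample inherits the isotonic value of its nearest labeled neighbor in the ranking, which conveys how ordinal information can substitute for most of the labels.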
