A New Active Learning Scheme with Applications to Learning to Rank from Pairwise Preferences

We consider the statistical learning setting of active learning, in which the learner chooses which examples to obtain labels for. We identify a useful general-purpose structural property of such learning problems, giving rise to a query-efficient iterative procedure that achieves approximately optimal loss at an exponentially fast rate, where the rate is measured in units of error per label. The effectiveness of our ideas is demonstrated on the problem of learning to rank from pairwise preference labels, known as minimum feedback arc set in tournaments when all of the quadratically many preferences are given as input. The net result is an efficient selective sampling method for this problem, achieving a (1 + ε)-competitive result using only O(n · poly(log n, ε⁻¹)) of the quadratically many preference queries. This result is information-theoretic in nature: it shows how to efficiently select information, not how to use it (computationally) for optimization. Nevertheless, our ideas transfer quite seamlessly to a convex-relaxation counterpart, giving rise to an iterative algorithm with an exponential convergence rate to a relaxation optimum. SVM and logistic regression are notable examples of relaxations to which this result applies. Such relaxations are popular in applications where the set of alternatives we wish to rank is embedded in a real vector space (a feature space), and we wish to fit a permutation induced by a linear function to the preference information. Moreover, in the particular case of a constant-dimensional feature space, we obtain a slight additional improvement in the query complexity as a function of the number of alternatives, using the powerful notion of ε-relative approximations in bounded-VC-dimension spaces. We believe that our iterative scheme and analysis method are interesting in their own right and will find use in other problems.
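The loss being minimized above is the number of pairwise preferences that a candidate ranking disagrees with (the feedback arc set cost of the induced tournament). As a minimal sketch of that objective, the following Python function counts disagreements between a ranking and a preference oracle; the function name and the toy tournament are illustrative assumptions, not taken from the paper:

```python
import itertools

def pairwise_disagreements(ranking, prefers):
    """Count pairs (u, v) that the ranking places in the order
    opposite to what the preference oracle prefers(u, v) reports."""
    position = {item: i for i, item in enumerate(ranking)}
    return sum(
        1
        for u, v in itertools.combinations(ranking, 2)
        if (position[u] < position[v]) != prefers(u, v)
    )

# A toy tournament on 4 alternatives: lower-numbered items are
# preferred, except that item 3 beats item 0 (one "upset" edge),
# so no ranking can achieve zero cost.
def prefers(u, v):
    if {u, v} == {0, 3}:
        return u == 3
    return u < v

print(pairwise_disagreements([0, 1, 2, 3], prefers))  # → 1
```

Note that this brute-force evaluation queries all of the quadratically many pairs; the point of the paper's selective sampling scheme is precisely to approximate the optimum while querying only O(n · poly(log n, ε⁻¹)) of them.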
∗ Technion, nailon@cs.technion.ac.il
† Technion, ronbeg@cs.technion.ac.il
‡ NYU Courant Institute, esther@cims.nyu.edu
