Beyond Disagreement-Based Agnostic Active Learning

We study agnostic active learning, where the goal is to learn a classifier in a pre-specified hypothesis class interactively with as few label queries as possible, while making no assumptions on the true function generating the labels. The main algorithms for this problem are disagreement-based active learning, which has a high label requirement, and margin-based active learning, which only applies to fairly restricted settings. A major challenge is to find an algorithm that achieves better label complexity, is consistent in an agnostic setting, and applies to general classification problems. In this paper, we provide such an algorithm. Our solution is based on two novel contributions -- a reduction from consistent active learning to confidence-rated prediction with guaranteed error, and a novel confidence-rated predictor.
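To make the contrast concrete, the following is a minimal illustrative sketch of the disagreement-based baseline the abstract refers to, not the paper's algorithm. It uses a hypothetical finite class of one-dimensional threshold classifiers in the realizable setting: a label is queried only when hypotheses still consistent with past answers disagree on the point, and all other points are labeled by consensus.

```python
import random

def make_thresholds(k):
    # Hypothesis h_t(x) = 1 if x >= t, else 0; k+1 candidate thresholds in [0, 1].
    return [t / k for t in range(k + 1)]

def disagreement_based_learner(points, oracle, k=100):
    """Query a label only inside the disagreement region of the version space."""
    version_space = make_thresholds(k)
    queries = 0
    for x in points:
        preds = {int(x >= t) for t in version_space}
        if len(preds) > 1:
            # x lies in the disagreement region: pay for a label query
            y = oracle(x)
            queries += 1
            # keep only hypotheses consistent with the observed label
            version_space = [t for t in version_space if int(x >= t) == y]
        # outside the disagreement region, all surviving hypotheses agree,
        # so the point can be labeled without querying the oracle
    return version_space, queries

random.seed(0)
xs = [random.random() for _ in range(200)]
# Hypothetical noiseless oracle: true threshold at 0.3 (realizable case).
vs, q = disagreement_based_learner(xs, oracle=lambda x: int(x >= 0.3))
```

The high label requirement mentioned in the abstract shows up here as the size of the disagreement region: early on it covers almost the whole domain, and in noisy (agnostic) settings it can stay large, which is what motivates the confidence-rated-prediction approach of the paper.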
