Agnostic active learning

We state and analyze the first active learning algorithm that works in the presence of arbitrary forms of noise. The algorithm, A2 (for Agnostic Active), relies only on the assumption that the samples are drawn i.i.d. from a fixed distribution. We show that A2 achieves an exponential improvement (i.e., it requires only O(ln(1/ε)) samples to find an ε-optimal classifier) over the usual sample complexity of supervised learning, for several settings previously studied in the realizable case. These include learning threshold classifiers and learning homogeneous linear separators when the input distribution is uniform over the unit sphere.
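To make the exponential-improvement claim concrete in the simplest setting, below is a minimal Python sketch of the threshold case: an A2-style learner that spends labels only inside its current region of disagreement (an interval of candidate thresholds) and halves that interval each round, so only O(ln(1/ε)) rounds are needed. The `label_oracle` interface, the majority-vote repetition schedule, and all parameter choices are illustrative assumptions standing in for A2's empirical-error confidence bounds, not the paper's exact algorithm.

```python
import math
import random

def a2_threshold_sketch(label_oracle, eps=1e-3, delta=0.05):
    """Illustrative sketch (not the paper's exact A2 pseudocode) of
    actively learning a threshold h_t(x) = 1[x >= t] on [0, 1].

    The learner keeps an interval [lo, hi] of plausible thresholds --
    its region of disagreement -- and queries labels only there. Each
    round it queries the midpoint several times and keeps the half of
    the interval consistent with the majority label, so the interval
    halves per round: the exponential improvement the abstract refers to.

    `label_oracle(x)` is an assumed interface returning a possibly
    noisy label in {0, 1}.
    """
    lo, hi = 0.0, 1.0
    rounds = math.ceil(math.log2(1.0 / eps))
    # Repeat each midpoint query enough times that a majority vote is
    # correct with high probability under bounded label noise (a crude
    # stand-in for A2's confidence-bound-based hypothesis elimination).
    repeats = max(1, math.ceil(math.log(rounds / delta)))
    labels_spent = 0
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        votes = sum(label_oracle(mid) for _ in range(repeats))
        labels_spent += repeats
        if votes * 2 >= repeats:   # majority label 1: threshold lies left of mid
            hi = mid
        else:                      # majority label 0: threshold lies right of mid
            lo = mid
    return (lo + hi) / 2.0, labels_spent

# Example: true threshold 0.37 with labels flipped 10% of the time.
if __name__ == "__main__":
    t_true = 0.37
    def noisy_oracle(x):
        y = 1 if x >= t_true else 0
        return y ^ (random.random() < 0.10)
    t_hat, spent = a2_threshold_sketch(noisy_oracle, eps=1e-3)
    print(f"estimated threshold {t_hat:.4f} using {spent} labels")
```

Note the label count: roughly log2(1/ε) rounds times a small per-round repetition factor, versus the Θ(1/ε) labeled examples passive supervised learning would need for the same accuracy.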
