Robust bounds for classification via selective sampling

We introduce a new algorithm for binary classification in the selective sampling protocol. Our algorithm uses Regularized Least Squares (RLS) as base classifier, and for this reason it can be efficiently run in any RKHS. Unlike previous margin-based semi-supervised algorithms, our sampling condition hinges on a simultaneous upper bound on bias and variance of the RLS estimate under a simple linear label noise model. This fact allows us to prove performance bounds that hold for an arbitrary sequence of instances. In particular, we show that our sampling strategy approximates the margin of the Bayes optimal classifier to any desired accuracy ε by asking Õ (d/ε2) queries (in the RKHS case d is replaced by a suitable spectral quantity). While these are the standard rates in the fully supervised i.i.d. case, the best previously known result in our harder setting was Õ (d3/ε4). Preliminary experiments show that some of our algorithms also exhibit a good practical performance.

[1]  Sanjoy Dasgupta,et al.  A General Agnostic Active Learning Algorithm , 2007, ISAIM.

[2]  Claudio Gentile,et al.  Incremental Algorithms for Hierarchical Classification , 2004, J. Mach. Learn. Res..

[3]  Michael L. Littman,et al.  Online Linear Regression and Its Application to Model-Based Reinforcement Learning , 2007, NIPS.

[4]  Claudio Gentile,et al.  Tracking the best hyperplane with a simple budget Perceptron , 2006, Machine Learning.

[5]  H. Sebastian Seung,et al.  Selective Sampling Using the Query by Committee Algorithm , 1997, Machine Learning.

[6]  Adam Tauman Kalai,et al.  Analysis of Perceptron-Based Active Learning , 2009, COLT.

[7]  David A. Cohn,et al.  Training Connectionist Networks with Queries and Selective Sampling , 1989, NIPS.

[8]  John Langford,et al.  Agnostic active learning , 2006, J. Comput. Syst. Sci..

[9]  Thomas J. Walsh,et al.  Knows what it knows: a framework for self-aware learning , 2008, ICML '08.

[10]  Peter Auer,et al.  Using Confidence Bounds for Exploitation-Exploration Trade-offs , 2003, J. Mach. Learn. Res..

[11]  Claudio Gentile,et al.  Worst-Case Analysis of Selective Sampling for Linear Classification , 2006, J. Mach. Learn. Res..

[12]  Barbara Caputo,et al.  The projectron: a bounded kernel-based Perceptron , 2008, ICML '08.

[13]  Maria-Florina Balcan,et al.  Margin Based Active Learning , 2007, COLT.

[14]  Claudio Gentile,et al.  Linear Classification and Selective Sampling Under Low Noise Conditions , 2008, NIPS.

[15]  Jason Weston,et al.  Online (and Offline) on an Even Tighter Budget , 2005, AISTATS.

[16]  Yoram Singer,et al.  The Forgetron: A Kernel-Based Perceptron on a Budget , 2008, SIAM J. Comput..