Doubly Aggressive Selective Sampling Algorithms for Classification

Online selective sampling algorithms learn to perform binary classification and additionally decide whether to ask, or query, for the label of any given example. We introduce two stochastic linear algorithms and analyze them in the worst-case mistake-bound framework. Although the algorithms are stochastic, for some inputs they query with probability 1, and they update the model even when no mistake is made, provided the margin is small; hence they are doubly aggressive. We prove worst-case bounds that may be lower than previous bounds in some settings. Experiments with 33 document classification datasets, some with hundreds of thousands of examples, show that the doubly aggressive algorithms are superior in both classification performance and number of queries.
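The two kinds of aggressiveness described above can be illustrated with a minimal sketch. This is not the paper's algorithm: the query rule (probability 1 when |margin| <= theta, otherwise b / (b + |margin|)), the passive-aggressive style step size, and the names `run_selective_sampler`, `b`, and `theta` are all assumptions chosen only to make the idea concrete for a first-order linear learner.

```python
import random


def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))


def run_selective_sampler(stream, dim, b=1.0, theta=0.1, seed=0):
    """Illustrative selective sampler; stream yields (x, y) with y in {-1, +1}.

    Assumed rules (not taken from the paper):
      * query with probability 1 if |margin| <= theta,
        otherwise with probability b / (b + |margin|);
      * after a query, update whenever y * margin < 1,
        i.e. also when the prediction was correct but the margin is small.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    queries = mistakes = updates = 0
    for x, y in stream:
        p = dot(w, x)                      # signed margin before seeing y
        y_hat = 1 if p >= 0 else -1
        if y_hat != y:
            mistakes += 1                  # counted for evaluation only
        # Aggressive querying: a small margin forces a query.
        q_prob = 1.0 if abs(p) <= theta else b / (b + abs(p))
        if rng.random() < q_prob:
            queries += 1
            # Aggressive updating: hinge loss is positive, mistake or not.
            if y * p < 1.0:
                norm_sq = dot(x, x)
                if norm_sq > 0:
                    tau = (1.0 - y * p) / norm_sq
                    w = [wi + tau * y * xi for wi, xi in zip(w, x)]
                    updates += 1
    return w, queries, mistakes, updates
```

Starting from a zero weight vector, the first example always has margin 0, so under these assumed rules it is queried with probability 1 and triggers an update even though the default prediction of +1 may already be correct.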
