Paper Information - Doubly Aggressive Selective Sampling Algorithms for Classification

Doubly Aggressive Selective Sampling Algorithms for Classification

Online selective sampling algorithms learn to perform binary classification and, in addition, decide whether to ask, or query, for the label of any given example. We introduce two stochastic linear algorithms and analyze them in the worst-case mistake-bound framework. Although the algorithms are stochastic, for some inputs they query with probability 1 and make an update even when there is no mistake but the margin is small; hence they are doubly aggressive. We prove bounds in the worst-case setting, which may be lower than previous bounds in some settings. Experiments with 33 document classification datasets, some with hundreds of thousands of examples, show the superiority of doubly-aggressive algorithms in both performance and number of queries.
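The two kinds of aggressiveness mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a generic margin-based selective sampler with a perceptron-style update, and the query rule b / (b + |margin|), the parameters `b` and `margin_threshold`, and the function name `selective_sampling_run` are all hypothetical choices made for illustration only.

```python
import random


def selective_sampling_run(examples, b=1.0, margin_threshold=0.5, seed=0):
    """Illustrative selective sampler (assumed rules, not the paper's method).

    examples: list of (x, y) pairs, x a list of floats, y in {-1, +1}.
    Returns the final weight vector, the mistake count, and the query count.
    """
    rng = random.Random(seed)
    w = [0.0] * len(examples[0][0])
    mistakes = 0
    queries = 0
    for x, y in examples:
        # Signed margin of the current linear predictor.
        p = sum(wi * xi for wi, xi in zip(w, x))
        y_hat = 1 if p >= 0 else -1
        # Mistakes are counted for evaluation only; in this simulation the
        # true label is known even when the learner does not query it.
        if y_hat != y:
            mistakes += 1
        # Assumed query rule: probability b / (b + |p|).
        # When p == 0 the probability is exactly 1, so the label is always queried.
        if rng.random() < b / (b + abs(p)):
            queries += 1
            # Aggressive update: also update on correct predictions
            # whose margin y * p is at or below the threshold.
            if y * p <= margin_threshold:
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes, queries


if __name__ == "__main__":
    data = [([1.0, 0.0], 1), ([0.0, 1.0], -1)] * 10
    print(selective_sampling_run(data))
```

In this sketch the very first example has margin 0, so it is queried with probability 1 and triggers an update even though the prediction may already be correct, which mirrors the behavior the abstract calls doubly aggressive. The actual query probabilities and update rules analyzed in the paper may differ.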

Koby Crammer

[1] Koby Crammer, et al. Multi-domain learning by confidence-weighted parameter combination, 2010, Machine Learning.

[2] John Langford, et al. Importance weighted active learning, 2008, ICML '09.

[3] Claudio Gentile, et al. Worst-Case Analysis of Selective Sampling for Linear Classification, 2006, J. Mach. Learn. Res.

[4] John Langford, et al. Agnostic Active Learning Without Constraints, 2010, NIPS.

[5] Claudio Gentile, et al. Learning noisy linear classifiers via adaptive and selective sampling, 2011, Machine Learning.

[6] Jürgen Forster, et al. On Relative Loss Bounds in Generalized Linear Regression, 1999, FCT.

[7] John Blitzer, et al. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification, 2007, ACL.

[8] Claudio Gentile, et al. Robust bounds for classification via selective sampling, 2009, ICML '09.

[9] V. Vovk. Competitive On-line Statistics, 2001.

[10] Manfred K. Warmuth, et al. Relative Loss Bounds for On-Line Density Estimation with the Exponential Family of Distributions, 1999, Machine Learning.

[11] Koby Crammer, et al. Confidence-Weighted Linear Classification for Text Categorization, 2012, J. Mach. Learn. Res.