Active Learning with Model Selection

Most active learning methods avoid model selection by training models of a single type (SVMs, boosted trees, etc.) with one predefined set of hyperparameters. We propose an algorithm that actively samples data to train a set of candidate models (different model types, different hyperparameters, or both) and, at the same time, to select the best model from this set. The algorithm actively samples training points that are most likely to improve the accuracy of the more promising candidate models, and it also samples points for model selection; all samples count against the same labeling budget. This exposes a natural trade-off between the focused active sampling that is most effective for training models and the unbiased sampling that is better for model selection. We empirically demonstrate on six test problems that this algorithm is nearly as effective as an active learning oracle that knows the optimal model in advance.
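The trade-off between focused and unbiased sampling can be illustrated with a small sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It is a toy under stated assumptions: the candidate models are k-nearest-neighbor classifiers on 1-D inputs that differ only in k, the focused step queries the point on which the candidates disagree most, and a hypothetical parameter `explore_prob` sets the fraction of the shared budget spent on uniformly drawn points that are held out for model selection. The names `knn_predict` and `select_model_actively` are likewise illustrative.

```python
import random


def knn_predict(train, x, k):
    """Majority vote of the k nearest labeled points (1-D inputs, labels 0/1)."""
    if not train:
        return 0  # arbitrary default before any training label exists
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def select_model_actively(pool, oracle, ks, budget, explore_prob, seed=0):
    """Toy loop: one labeling budget shared by training and model selection.

    Assumption (not from the paper): uniformly drawn points are used only
    for validation, and actively chosen points only for training.
    """
    rng = random.Random(seed)
    unlabeled = list(pool)
    train, val = [], []

    for _ in range(budget):
        if not unlabeled:
            break
        if rng.random() < explore_prob:
            # Unbiased sample: gives an unbiased accuracy estimate per candidate.
            x = unlabeled.pop(rng.randrange(len(unlabeled)))
            val.append((x, oracle(x)))
        else:
            # Focused sample: the point where the candidate models disagree most.
            def disagreement(x):
                ones = sum(knn_predict(train, x, k) for k in ks)
                return min(ones, len(ks) - ones)

            idx = max(range(len(unlabeled)), key=lambda i: disagreement(unlabeled[i]))
            x = unlabeled.pop(idx)
            train.append((x, oracle(x)))

    def val_accuracy(k):
        if not val:
            return 0.0
        hits = sum(knn_predict(train, x, k) == y for x, y in val)
        return hits / len(val)

    best_k = max(ks, key=val_accuracy)  # ties resolve to the earliest k in ks
    return best_k, train, val
```

Raising `explore_prob` makes the validation estimate more reliable but leaves fewer labels for training; lowering it does the opposite. That is the tension the abstract describes, here reduced to a single fixed knob rather than anything adaptive.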
