Online Choice of Active Learning Algorithms

This paper addresses how to combine an ensemble of active learners online so as to expedite learning during a pool-based active learning session. We develop an active learning master algorithm based on a known competitive algorithm for the multi-armed bandit problem and a novel semi-supervised performance evaluation statistic. Using an ensemble containing two of the best-known active learning algorithms and one new algorithm, we show empirically that the resulting master algorithm consistently performs almost as well as, and sometimes outperforms, the best algorithm in the ensemble on a range of classification problems.
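To make the idea of a bandit-driven master algorithm concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not name the bandit algorithm, so EXP3 is used here as an assumed stand-in. The reward is a simulated Bernoulli draw with hypothetical success rates, standing in for the semi-supervised performance statistic, which is not specified in this text. The names `Exp3` and `run_session` are illustrative only.

```python
import math
import random


class Exp3:
    """EXP3 bandit over k arms; each arm stands for one active learner (assumption)."""

    def __init__(self, k, gamma=0.1, seed=0):
        self.k = k
        self.gamma = gamma              # exploration rate in (0, 1]
        self.weights = [1.0] * k
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the normalized weights with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.k
                for w in self.weights]

    def choose(self):
        # Sample the index of the learner that picks the next query.
        probs = self.probabilities()
        return self.rng.choices(range(self.k), weights=probs)[0]

    def update(self, arm, reward):
        # reward must lie in [0, 1]; only the chosen arm is updated,
        # using an importance-weighted reward estimate.
        p = self.probabilities()[arm]
        estimate = reward / p
        self.weights[arm] *= math.exp(self.gamma * estimate / self.k)
        # Rescale so the largest weight is 1; this leaves the
        # probabilities unchanged and avoids floating-point overflow.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def run_session(mean_rewards, rounds=2000, seed=0):
    """Simulate a session; mean_rewards are hypothetical per-learner success rates."""
    bandit = Exp3(len(mean_rewards), seed=seed)
    reward_rng = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.choose()
        # Placeholder reward: in a real session this would come from a
        # performance estimate computed after labeling the queried point.
        reward = 1.0 if reward_rng.random() < mean_rewards[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.probabilities()
```

Calling `run_session([0.1, 0.3, 0.9])` returns the final selection probabilities over the three simulated learners. In this sketch the master gradually concentrates its choices on the learner whose simulated rewards are highest, while the `gamma` term keeps every learner occasionally sampled.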
