Results of the Active Learning Challenge

We organized a machine learning challenge on "active learning", addressing problems where labeling data is expensive but large amounts of unlabeled data are available at low cost. Examples include handwriting and speech recognition, document classification, vision tasks, drug design using recombinant molecules, and protein engineering. The algorithms may place a limited number of queries to get new sample labels. The design of the challenge and its results are summarized in this paper, and the best contributions made by the participants are included in these proceedings. The website of the challenge remains open as a resource for students and researchers (http://clopinet.com/al).
