Active Algorithm Selection

Most previous studies on active learning focused on the problem of model selection, i.e., how to identify the optimal classification model from a family of predefined models using a small, carefully selected training set. In this paper, we address the problem of active algorithm selection. The goal of this problem is to efficiently identify the optimal learning algorithm for a given dataset from a set of algorithms using a small training set. In this study, we present a general framework for active algorithm selection by extending the idea of the Hedge algorithm. It employs the worst case analysis to identify the example that can effectively increase the weighted loss function defined in the Hedge algorithm. We further extend the framework by incorporating the correlation information among unlabeled examples to accurately estimate the change in the weighted loss function, and Maximum Entropy Discrimination to automatically determine the combination weights used by the Hedge algorithm. Our empirical study with the datasets of WCCI 2006 performance prediction challenge shows promising performance of the proposed framework for active algorithm selection.

[1]  Prasad Tadepalli,et al.  Active Learning with Committees for Text Categorization , 1997, AAAI/IAAI.

[2]  H. Sebastian Seung,et al.  Query by committee , 1992, COLT '92.

[3]  Foster J. Provost,et al.  Active Sampling for Class Probability Estimation and Ranking , 2004, Machine Learning.

[4]  Tong Zhang,et al.  The Value of Unlabeled Data for Classification Problems , 2000, ICML 2000.

[5]  Nello Cristianini,et al.  Query Learning with Large Margin Classi ersColin , 2000 .

[6]  Daphne Koller,et al.  Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..

[7]  Tommi S. Jaakkola,et al.  Maximum Entropy Discrimination , 1999, NIPS.

[8]  Naoki Abe,et al.  Query Learning Strategies Using Boosting and Bagging , 1998, ICML.

[9]  Greg Schohn,et al.  Less is More: Active Learning with Support Vector Machines , 2000, ICML.

[10]  Daphne Koller,et al.  Support Vector Machine Active Learning with Application sto Text Classification , 2000, ICML.

[11]  Raymond J. Mooney,et al.  Active Learning for Natural Language Parsing and Information Extraction , 1999, ICML.

[12]  H. Sebastian Seung,et al.  Selective Sampling Using the Query by Committee Algorithm , 1997, Machine Learning.

[13]  William A. Gale,et al.  A sequential algorithm for training text classifiers , 1994, SIGIR '94.

[14]  Andrew McCallum,et al.  Toward Optimal Active Learning through Sampling Estimation of Error Reduction , 2001, ICML.

[15]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[16]  David J. C. MacKay,et al.  Information-Based Objective Functions for Active Data Selection , 1992, Neural Computation.