Active Learning through Adaptive Heterogeneous Ensembling

An open question in ensemble-based active learning is how to choose one classifier type, or an appropriate combination of multiple classifier types, to construct an ensemble for a given task. While existing approaches typically commit to a single classifier type, this paper presents a method that trains multiple instances of multiple classifier types and adapts them toward an appropriate ensemble during active learning. The method is termed adaptive heterogeneous ensembles (henceforth referred to as AHE). Experimental evaluations show that AHE constructs heterogeneous ensembles that outperform homogeneous ensembles composed of any one of the classifier types, as well as bagging, boosting, and the random subspace method with random sampling. We also show that the advantage of AHE over the other methods increases when (1) the overall size of the ensemble also adapts during learning, and (2) the target data set contains more than two class labels. Through analysis we show that AHE outperforms the other methods because it automatically discovers complementary classifiers: for each instance in the data set, the ensemble members of the classifier type best suited to that instance vote together, while members of the other, less appropriate classifier types disagree among themselves, so the overall majority vote remains correct.
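
The core mechanism described above is query-by-committee over a heterogeneous ensemble: disagreement among members of different classifier types drives instance selection, and a majority vote produces the final prediction. The following is a minimal sketch of that general idea, assuming scikit-learn classifiers, a vote-entropy query rule, and a simulated oracle; the classifier types, data, and retraining step are illustrative assumptions and do not reproduce the paper's AHE adaptation rule for adding or removing ensemble members.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

def vote_entropy(committee, X):
    """Disagreement score per instance (higher = more disagreement among members)."""
    votes = np.array([clf.predict(X) for clf in committee])  # shape: (n_members, n_samples)
    n_members = len(committee)
    scores = np.zeros(X.shape[0])
    for j in range(X.shape[0]):
        _, counts = np.unique(votes[:, j], return_counts=True)
        p = counts / n_members
        scores[j] = -np.sum(p * np.log(p + 1e-12))
    return scores

# Toy data standing in for a small labelled seed set and an unlabelled pool.
X, y = make_classification(n_samples=500, n_classes=3, n_informative=5, random_state=0)
labelled = list(range(20))
pool = list(range(20, 500))

# Heterogeneous committee: several instances of several classifier types.
committee = [DecisionTreeClassifier(max_depth=d, random_state=0) for d in (3, 6)]
committee += [GaussianNB(), KNeighborsClassifier(n_neighbors=5)]

for _ in range(30):                        # active-learning rounds
    for clf in committee:                  # (re)train each member on the growing labelled set
        clf.fit(X[labelled], y[labelled])
    scores = vote_entropy(committee, X[pool])
    query = pool[int(np.argmax(scores))]   # query the most-disputed pool instance
    labelled.append(query)                 # oracle label revealed (simulated here)
    pool.remove(query)

# Final prediction by majority vote of the committee.
votes = np.array([clf.predict(X) for clf in committee])
majority = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("majority-vote accuracy:", (majority == y).mean())
```

In this sketch every member is simply refit each round; the paper's contribution is instead to adapt which classifier types, how many instances of each, and (optionally) the overall ensemble size are retained as labels accumulate.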
