Creating diversity in ensembles using artificial data

Abstract The diversity of an ensemble of classifiers is known to be an important factor in determining its generalization error. We present a new method for generating ensembles, Decorate (Diverse Ensemble Creation by Oppositional Relabeling of Artificial Training Examples), that directly constructs diverse hypotheses using additional artificially constructed training examples. The technique is a simple, general meta-learner that can use any strong learner as a base classifier to build diverse committees. Experimental results using decision-tree induction as a base learner demonstrate that this approach consistently achieves higher predictive accuracy than the base classifier, Bagging, and Random Forests. Decorate also obtains higher accuracy than Boosting on small training sets, and achieves comparable performance on larger training sets.
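As a rough illustration of the idea described in the abstract, the sketch below builds a Decorate-style ensemble with scikit-learn decision trees. It is a minimal sketch, not the authors' reference implementation: it assumes numeric features only, samples artificial examples from per-feature Gaussians fit to the training data, labels them with probability inversely proportional to the current ensemble's class probabilities, and keeps a candidate member only if ensemble training error does not increase. All names (decorate, r_art, max_iters) are illustrative.

```python
# Hedged sketch of a Decorate-style ensemble builder (numeric features only).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def decorate(X, y, ensemble_size=10, max_iters=50, r_art=1.0, random_state=0):
    rng = np.random.default_rng(random_state)
    classes = np.unique(y)
    n_art = int(r_art * len(X))

    def ensemble_proba(members, X_):
        # Average class probabilities over all committee members.
        return np.mean([m.predict_proba(X_) for m in members], axis=0)

    def ensemble_error(members):
        pred = classes[np.argmax(ensemble_proba(members, X), axis=1)]
        return np.mean(pred != y)

    # Start with one classifier trained on the original data.
    ensemble = [DecisionTreeClassifier(random_state=random_state).fit(X, y)]
    best_err = ensemble_error(ensemble)

    for _ in range(max_iters):
        if len(ensemble) >= ensemble_size:
            break
        # 1. Sample artificial examples from per-feature Gaussians fit to X.
        X_art = rng.normal(X.mean(axis=0), X.std(axis=0) + 1e-9,
                           size=(n_art, X.shape[1]))
        # 2. Label them "oppositionally": probability of each class is
        #    inversely proportional to the ensemble's predicted probability.
        p = ensemble_proba(ensemble, X_art)
        inv = 1.0 / (p + 1e-9)
        inv /= inv.sum(axis=1, keepdims=True)
        y_art = np.array([rng.choice(classes, p=row) for row in inv])
        # 3. Train a candidate member on original + artificial data.
        cand = DecisionTreeClassifier(random_state=random_state).fit(
            np.vstack([X, X_art]), np.concatenate([y, y_art]))
        # 4. Keep the candidate only if ensemble training error does not rise.
        err = ensemble_error(ensemble + [cand])
        if err <= best_err:
            ensemble.append(cand)
            best_err = err
    return ensemble
```

At prediction time, one would average `predict_proba` over the returned members and take the argmax, as in `ensemble_proba` above; the artificial-example ratio `r_art` plays the role of the diversity knob described in the paper.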
