Boosting for class-imbalanced datasets using genetically evolved supervised non-linear projections

It has repeatedly been shown that most classification methods suffer when the distribution of training instances among classes is imbalanced. Most learning algorithms expect an approximately even distribution of instances across the different classes and degrade, to varying extents, when that is not the case. Dealing with the class-imbalance problem is a difficult but relevant task, as many of the most interesting and challenging real-world problems have a very uneven class distribution. In this paper we present a new approach for dealing with class-imbalanced datasets, based on a new boosting method for constructing ensembles of classifiers. At each boosting round, the weight distribution maintained by the boosting algorithm is used to obtain a supervised projection of the data; the next classifier is then trained on the projected data using a uniform distribution of the training instances. We tested our method on 35 class-imbalanced datasets with two different base classifiers: a decision tree and a support vector machine. The proposed methodology proved useful, outperforming other methods in terms of both the geometric mean of sensitivity and specificity and the area under the ROC curve.
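To make the round-by-round loop concrete, the following is a minimal Python sketch of a boosting-with-supervised-projection scheme of this kind. It is not the paper's algorithm: the genetically evolved non-linear projection is replaced by a simple stand-in (an LDA fitted to a weight-driven resample of the training set), binary {0, 1} labels are assumed, and every helper name here (fit_supervised_projection, boost_with_projections, n_rounds) is hypothetical.

    # Sketch only: the evolved non-linear projection of the paper is replaced
    # by an LDA stand-in, and binary labels in {0, 1} are assumed throughout.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.tree import DecisionTreeClassifier

    def fit_supervised_projection(X, y, w):
        # Stand-in for the supervised projection: emphasise the instances the
        # current ensemble finds hard by resampling the data according to the
        # boosting weights w, then fit LDA on that resample. Assumes both
        # classes survive the resample.
        idx = np.random.choice(len(X), size=len(X), p=w / w.sum())
        proj = LinearDiscriminantAnalysis(n_components=1)
        proj.fit(X[idx], y[idx])
        return proj

    def boost_with_projections(X, y, n_rounds=10):
        n = len(X)
        w = np.full(n, 1.0 / n)      # boosting weight distribution
        ensemble = []
        for _ in range(n_rounds):
            # The weight distribution drives the projection ...
            proj = fit_supervised_projection(X, y, w)
            Xp = proj.transform(X)
            # ... while the base classifier is trained on the projected data
            # with a uniform distribution of the training instances.
            clf = DecisionTreeClassifier().fit(Xp, y)
            pred = clf.predict(Xp)
            err = np.sum(w[pred != y])   # weighted training error
            if err >= 0.5 or err == 0:
                break
            alpha = 0.5 * np.log((1 - err) / err)
            w *= np.exp(alpha * (pred != y))   # AdaBoost-style update
            w /= w.sum()
            ensemble.append((alpha, proj, clf))
        return ensemble

    def predict(ensemble, X):
        # Weighted vote of the base classifiers, each applied in its own
        # projected space; votes in {-1, +1}, output labels in {0, 1}.
        votes = sum(a * np.where(c.predict(p.transform(X)) == 1, 1, -1)
                    for a, p, c in ensemble)
        return np.where(votes >= 0, 1, 0)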
