Hybrid Classifier Ensemble for Imbalanced Data

The class imbalance problem remains a central challenge in classification. Conventional imbalance learning methods have been proposed to tackle it, but they have notable limitations: 1) undersampling methods risk discarding important information and 2) cost-sensitive methods are sensitive to outliers and noise. To address these issues, we propose a hybrid optimal ensemble classifier framework that combines density-based undersampling with cost-sensitive methods, searching for strong trade-off solutions via a multi-objective optimization algorithm. Specifically, we first develop a density-based undersampling method that selects informative samples from the original training data using a probability-based data transformation, yielding multiple subsets with a balanced distribution across classes. Second, we apply a cost-sensitive classification method to compensate for the information lost during undersampling, increasing the weights of misclassified minority samples rather than those of the majority class. Finally, we introduce a multi-objective optimization procedure and exploit relationships between samples to refine the classification results within an ensemble classifier framework. Extensive comparative experiments on real-world data sets demonstrate that our method outperforms the majority of imbalance and ensemble classification approaches.
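To make the first step concrete, here is a minimal illustrative sketch of the general idea behind density-based undersampling with probability-based selection. This is not the authors' exact procedure: the k-nearest-neighbor density estimate, the exponential-key weighted sampling, and all function names (`density_undersample`, `balanced_subsets`) are assumptions chosen for illustration. Majority-class points in sparse regions (often near the decision boundary) receive higher selection probability, and repeated draws yield several balanced majority/minority training subsets for an ensemble.

```python
import math
import random

def density_undersample(X_maj, n_keep, k=5, seed=0):
    """Keep n_keep majority points, favoring sparse (informative) regions.

    Density is approximated by the mean distance to the k nearest
    majority neighbors; points in sparser regions get larger weights.
    """
    rnd = random.Random(seed)
    weights = []
    for i, x in enumerate(X_maj):
        # distances to all other majority points, ascending
        d = sorted(math.dist(x, y) for j, y in enumerate(X_maj) if j != i)
        weights.append(sum(d[:k]) / k)  # sparser -> larger mean kNN distance
    # weighted sampling without replacement (exponential-key trick):
    # drawing the n_keep smallest Exp(1)/w_i keys selects index i with
    # probability proportional to w_i
    keyed = sorted(range(len(X_maj)),
                   key=lambda i: rnd.expovariate(1.0) / weights[i])
    return [X_maj[i] for i in keyed[:n_keep]]

def balanced_subsets(X_maj, X_min, n_subsets=3, k=5, seed=0):
    """Build several balanced (majority-subset, minority) training pairs."""
    return [(density_undersample(X_maj, len(X_min), k=k, seed=seed + i),
             X_min)
            for i in range(n_subsets)]
```

Each returned pair has as many majority as minority samples, so a base classifier trained on it sees a balanced distribution; training one base learner per subset gives the ensemble its diversity.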
