Multi-Imbalance: An open-source software for multi-class imbalance learning

Abstract: Imbalanced classification is one of the most challenging research problems in machine learning. Techniques for two-class imbalanced classification are now relatively mature, yet multi-class imbalance learning remains an open problem. Moreover, the community lacks a software tool that integrates the major works in the field. In this paper, we present Multi-Imbalance, an open-source software package for multi-class imbalanced data classification. It provides users with seven categories of multi-class imbalance learning algorithms, including the latest advances in the field. The source code and documentation for Multi-Imbalance are publicly available at https://github.com/chongshengzhang/Multi_Imbalance.
