Adaptive Decision Threshold-Based Extreme Learning Machine for Classifying Imbalanced Multi-label Data

Multi-label learning is a popular area of machine learning research because it applies to many real-world scenarios. Compared with traditional binary and multi-class classification tasks, multi-label classification is more vulnerable to an imbalanced data distribution. This paper describes an adaptive decision threshold-based extreme learning machine algorithm (ADT-ELM) that addresses the imbalanced multi-label data classification problem. Specifically, the macro and micro F-measure metrics are adopted as the optimization functions for ADT-ELM, and the particle swarm optimization algorithm is employed to determine the optimal combination of decision thresholds. The optimized thresholds are then used to make decisions for future multi-label instances. Twelve baseline multi-label data sets are used in a series of experiments to verify the effectiveness and superiority of the proposed algorithm. The experimental results indicate that the proposed ADT-ELM algorithm is significantly superior to many state-of-the-art multi-label imbalance learning algorithms, and that it generally requires less training time than more sophisticated algorithms.
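The threshold search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a trained model (an ELM in the paper, but any scorer here) has already produced a real-valued score per instance and label, and it uses a basic particle swarm to pick one decision threshold per label so as to maximize the macro or micro F-measure. The function names `f_measure` and `pso_thresholds`, the swarm settings, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def f_measure(y_true, y_pred, average="macro"):
    """F-measure for 0/1 indicator matrices of shape (n_samples, n_labels)."""
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=0).astype(float)
    fp = np.sum((y_true == 0) & (y_pred == 1), axis=0).astype(float)
    fn = np.sum((y_true == 1) & (y_pred == 0), axis=0).astype(float)
    if average == "micro":
        # Micro: pool the counts over all labels before computing F.
        tp, fp, fn = tp.sum(), fp.sum(), fn.sum()
    denom = 2 * tp + fp + fn
    # Labels with no positives and no predicted positives score 0.
    f = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    return float(np.mean(f))

def pso_thresholds(scores, y_true, average="macro", n_particles=20, n_iter=50,
                   inertia=0.7, c1=1.5, c2=1.5, default=0.5, seed=0):
    """Search one threshold per label with a basic PSO (illustrative settings)."""
    rng = np.random.default_rng(seed)
    n_labels = scores.shape[1]
    lo, hi = scores.min(), scores.max()

    def fitness(t):
        return f_measure(y_true, (scores >= t).astype(int), average)

    pos = rng.uniform(lo, hi, size=(n_particles, n_labels))
    pos[0] = default  # seed one particle with the fixed default threshold
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmax(pbest_fit))
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, float(gbest_fit)

# Synthetic, imbalanced example: three labels with positive rates 5%, 10%, 30%.
rng = np.random.default_rng(42)
n, q = 400, 3
y = (rng.random((n, q)) < np.array([0.05, 0.10, 0.30])).astype(int)
scores = np.clip(0.25 * y + rng.normal(0.2, 0.15, size=(n, q)), 0, 1)

thresholds, best_f = pso_thresholds(scores, y, average="macro")
baseline_f = f_measure(y, (scores >= 0.5).astype(int), "macro")
print("per-label thresholds:", thresholds)
print("macro F at 0.5:", baseline_f, "after search:", best_f)
```

Because one particle starts at the fixed default threshold, the F-measure found on the data used for the search is never lower than that baseline. In practice the thresholds would be tuned on training or validation scores and then applied unchanged to unseen instances.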
