New imbalanced bearing fault diagnosis method based on Sample-characteristic Oversampling TechniquE (SCOTE) and multi-class LS-SVM

Abstract In industrial practice, the historical data sets available for bearing fault diagnosis are generally limited, imbalanced, and made up of multiple classes. Traditional fault diagnosis methods, such as the multi-class least squares support vector machine (multi-class LS-SVM), are not very effective under these conditions. We therefore propose a new multi-class imbalanced fault diagnosis method that combines multi-class LS-SVM with the Sample-characteristic Oversampling Technique (SCOTE), a new oversampling method. SCOTE transforms a multi-class imbalanced problem into multiple binary imbalanced problems. In each binary problem, SCOTE first applies a k-nearest neighbours (knn) noise processing method to filter out noisy points. Second, it trains an LS-SVM on the samples and sorts the minority samples by importance according to the misclassification error of the minority class on the training set. Third, guided by this importance ranking, SCOTE synthesizes new samples using the k* information nearest neighbours (k*inn) to rebalance the binary problem. Once every binary imbalanced problem has been addressed, the multi-class imbalanced problem is addressed as well. Twenty fault diagnosis examples based on Case Western Reserve University (CWRU) bearing data and Intelligent Maintenance Systems (IMS) bearing data show that the proposed method achieves higher fault recognition rates and greater robustness than 8 oversampling algorithms and 8 multi-class imbalanced algorithms.
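The three steps for a single binary problem can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the noise filter drops any sample with no same-class point among its k nearest neighbours, importance is taken as the LS-SVM error e_i = 1 - y_i f(x_i) on the training set, and, because k*inn is not defined here, a SMOTE-style interpolation between minority neighbours stands in for the synthesis step. All function names, the RBF kernel, and the parameters `C`, `gamma`, and `k` are hypothetical.

```python
import numpy as np

def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)

def knn_noise_filter(X, y, k=3):
    """Assumed rule: drop samples with no same-class point among their k nearest neighbours."""
    d = sq_dists(X, X)
    np.fill_diagonal(d, np.inf)               # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]         # indices of the k nearest neighbours
    keep = (y[nn] == y[:, None]).any(axis=1)
    return X[keep], y[keep]

def lssvm_fit(X, y, C=10.0, gamma=1.0):
    """Solve the LS-SVM classifier linear system for labels y in {-1, +1} (RBF kernel)."""
    n = len(y)
    K = np.exp(-gamma * sq_dists(X, X))
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * K + np.eye(n) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], np.ones(n))))
    return sol[1:], sol[0]                    # alpha, b

def lssvm_decision(X_train, y_train, alpha, b, X, gamma=1.0):
    """Decision values f(x) = sum_i alpha_i y_i K(x, x_i) + b."""
    return np.exp(-gamma * sq_dists(X, X_train)) @ (alpha * y_train) + b

def rank_minority(X, y, C=10.0, gamma=1.0):
    """Indices of minority samples (label +1), largest training error first."""
    alpha, b = lssvm_fit(X, y, C, gamma)
    f = lssvm_decision(X, y, alpha, b, X, gamma)
    idx = np.flatnonzero(y == 1)
    err = 1.0 - y[idx] * f[idx]               # assumed importance score
    return idx[np.argsort(-err)]

def synthesize(X, ranked_idx, n_new, k=3, seed=0):
    """SMOTE-style stand-in for the k*inn synthesis step (not the SCOTE rule)."""
    rng = np.random.default_rng(seed)
    Xmin = X[ranked_idx]
    k = min(k, len(Xmin) - 1)                 # needs at least two minority samples
    d = sq_dists(Xmin, Xmin)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    out = []
    for j in range(n_new):
        i = j % len(Xmin)                     # cycle through the ranking, hardest first
        nb = Xmin[rng.choice(nn[i])]          # random minority neighbour
        out.append(Xmin[i] + rng.random() * (nb - Xmin[i]))
    return np.array(out)
```

A typical call sequence would be `knn_noise_filter`, then `rank_minority` on the filtered data, then `synthesize` with the number of samples needed to balance the two classes. In a multi-class setting the same sequence would be repeated for each binary problem produced by the decomposition.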
