New imbalanced fault diagnosis framework based on Cluster-MWMOTE and MFO-optimized LS-SVM using limited and complex bearing data

Abstract: Because rolling bearings operate under complex working conditions, historical rolling bearing datasets are mostly limited and imbalanced. The fault data may also consist of multiple subclusters; that is, the historical data exhibit both between-class and within-class imbalance. Support vector machines, such as the least squares support vector machine (LS-SVM), offer advantages when data are limited, but traditional LS-SVM-based fault diagnosis tends to fail on complex imbalanced data and depends heavily on the classifier hyperparameters. This paper therefore presents a new imbalanced fault diagnosis framework based on a cluster-majority weighted minority oversampling technique (Cluster-MWMOTE) and a moth-flame optimization (MFO)-based LS-SVM classifier. As an extension of MWMOTE, the proposed Cluster-MWMOTE combines MWMOTE with a clustering algorithm, with agglomerative hierarchical clustering (AHC) as the representative choice. Unlike MWMOTE, Cluster-MWMOTE avoids ignoring small subclusters of faulty (minority) instances that lie far from normal (majority) instances; that is, it adapts better to within-class imbalance. As a novel heuristic intelligent algorithm, MFO exhibits faster convergence and higher precision than traditional optimization algorithms such as particle swarm optimization (PSO) and the genetic algorithm (GA). We therefore use MFO, for the first time, to optimize the hyperparameters (σ and γ) of the LS-SVM classifier. Fault diagnosis results on the CWRU and IMS bearing data suggest that the proposed framework provides higher fault recognition rates and greater algorithm robustness than 16 existing algorithms.
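The idea of clustering the minority class before oversampling, so that a small, distant fault subcluster still receives synthetic samples, can be illustrated with a minimal sketch. It is not the authors' Cluster-MWMOTE: the function names `agglomerative_clusters` and `oversample_by_cluster` are hypothetical, average linkage is an assumed choice for the AHC step, and MWMOTE's instance weighting is replaced here by a plain round-robin allocation across subclusters with uniform linear interpolation inside each one.

```python
import math
import random


def agglomerative_clusters(points, n_clusters):
    """Naive average-linkage AHC (assumed linkage): repeatedly merge the
    two closest clusters until n_clusters remain. Returns lists of indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        _, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]  # merge q into p (q > p)
        del clusters[q]
    return clusters


def oversample_by_cluster(minority, n_new, n_clusters, seed=0):
    """Generate n_new synthetic minority points, visiting subclusters in
    round-robin order (a simplification, not MWMOTE's weighting) and
    interpolating only between members of the same subcluster."""
    rng = random.Random(seed)
    clusters = agglomerative_clusters(minority, n_clusters)
    synthetic = []
    for k in range(n_new):
        members = clusters[k % len(clusters)]
        a = minority[rng.choice(members)]
        b = minority[rng.choice(members)]
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return clusters, synthetic


# Toy fault (minority) data: six points near the origin and a small
# two-point subcluster far away, mimicking within-class imbalance.
minority = [
    (0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.7, 0.7), (0.3, 0.9), (0.9, 0.1),
    (10.0, 10.0), (10.4, 10.3),
]
clusters, synthetic = oversample_by_cluster(minority, n_new=6, n_clusters=2)
print([len(c) for c in clusters], len(synthetic))
```

Because synthetic points are drawn only within a subcluster, none of them falls in the empty region between the two groups, and the small subcluster receives as many new samples as the large one under this simplified allocation.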
