Support vector data description for machinery multi-fault classification with unbalanced datasets

In mechanical fault diagnosis, fault samples are often difficult to obtain, so they are far fewer than normal samples, which leads to unbalanced datasets. A novel model combining support vector data description (SVDD) and a binary tree (BT) based on Mahalanobis distance is proposed to address multi-class classification with unbalanced datasets. The idea is to divide the original samples into a series of subsets using a binary tree, and then to build classifiers by describing the boundary of the target class via SVDD. The method focuses on two aspects: 1) A separability measure based on Mahalanobis distance. The measure quantifies the degree of separability, taking into account both the degree of imbalance and the distance between classes. Because Mahalanobis distance by definition considers the relations among all features of the dataset, the measure helps determine the structure of the binary tree. 2) Training classifiers with SVDD, where the target class is chosen according to the order given by the binary tree. The proposed method can be applied to multi-class problems with unbalanced datasets. To validate the methodology, samples from an unbalanced rotor are used in an experiment. A comparison with other methods shows that the proposed methodology achieves better performance and higher classification accuracy on multi-class problems with unbalanced datasets.
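The role of the separability measure in ordering the binary tree can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: `mahalanobis_between` uses a pooled covariance estimate (which weights each class by its sample count, but is not necessarily how the paper accounts for imbalance), and `tree_order` is a hypothetical greedy rule that splits off the class with the largest mean separability first. The SVDD training step at each tree node is not shown.

```python
import numpy as np


def mahalanobis_between(X_a, X_b):
    """Mahalanobis distance between the means of two classes.

    Assumption of this sketch: a pooled covariance estimate is used,
    and each class has at least two samples.
    """
    X_a = np.asarray(X_a, dtype=float)
    X_b = np.asarray(X_b, dtype=float)
    n_a, n_b = len(X_a), len(X_b)
    # Pooled covariance: each class contributes by its degrees of freedom.
    cov_a = np.atleast_2d(np.cov(X_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(X_b, rowvar=False))
    cov = ((n_a - 1) * cov_a + (n_b - 1) * cov_b) / (n_a + n_b - 2)
    diff = X_a.mean(axis=0) - X_b.mean(axis=0)
    # The pseudo-inverse guards against a singular covariance matrix.
    return float(np.sqrt(diff @ np.linalg.pinv(cov) @ diff))


def tree_order(classes):
    """Hypothetical greedy ordering of classes for the binary tree.

    `classes` maps a label to an (n_samples, n_features) array. At each
    step the class with the largest mean distance to the remaining
    classes is split off and becomes the next target class.
    """
    remaining = dict(classes)
    order = []
    while len(remaining) > 1:
        def score(label):
            return np.mean([
                mahalanobis_between(remaining[label], remaining[other])
                for other in remaining if other != label
            ])
        best = max(remaining, key=score)
        order.append(best)
        del remaining[best]
    order.extend(remaining)  # the last class needs no further split
    return order


# Synthetic, unbalanced data (illustrative only, not the rotor dataset).
rng = np.random.default_rng(0)
data = {
    "normal": rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2)),
    "fault_a": rng.normal(loc=[6.0, 6.0], scale=1.0, size=(15, 2)),
    "fault_b": rng.normal(loc=[1.5, 0.0], scale=1.0, size=(20, 2)),
}
print(tree_order(data))
```

In a full pipeline, each label in the returned order would serve as the target class for one SVDD classifier, trained against the samples of the classes that come later in the order.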
