Feature Selection Based on the Kullback-Leibler Distance and its Application on Fault Diagnosis

The core concept of pattern recognition is to uncover the inherent structure of data within the same class. Within-class data share a similar distribution, while between-class data differ in various ways. Feature selection exploits these between-class differences to reduce the number of features used in training models, and many feature selection methods have been widely applied across different fields. This paper proposes a novel feature selection method based on the Kullback-Leibler distance, which measures the distance between the distributions of a feature in the two classes. For fault diagnosis, the proposed feature selection method is combined with a support vector machine to improve its performance. Experimental results validate the effectiveness and superiority of the proposed feature selection method, and the proposed diagnosis model increases the detection rate on a chemical process.
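The abstract does not spell out how the Kullback-Leibler distance is estimated or how features are ranked, so the following is only a minimal sketch of the general idea, assuming univariate Gaussian class-conditional distributions (for which the KL divergence has a closed form) and a symmetric two-way KL distance as the per-feature score; the paper's actual estimator and selection rule may differ.

```python
import numpy as np

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    # Closed-form KL(N(mu_p, var_p) || N(mu_q, var_q)) for univariate Gaussians.
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

def kl_feature_scores(X, y):
    """Score each feature by the symmetric KL distance between its
    class-conditional distributions (two-class case, Gaussian assumption)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, var0 = X0.mean(axis=0), X0.var(axis=0) + 1e-12  # small floor avoids division by zero
    mu1, var1 = X1.mean(axis=0), X1.var(axis=0) + 1e-12
    return gaussian_kl(mu0, var0, mu1, var1) + gaussian_kl(mu1, var1, mu0, var0)

# Toy data: feature 0 separates the classes, feature 1 is pure noise.
rng = np.random.default_rng(0)
X = np.vstack([
    np.column_stack([rng.normal(0.0, 1.0, 200), rng.normal(0.0, 1.0, 200)]),
    np.column_stack([rng.normal(3.0, 1.0, 200), rng.normal(0.0, 1.0, 200)]),
])
y = np.repeat([0, 1], 200)

scores = kl_feature_scores(X, y)
top_k = np.argsort(scores)[::-1][:1]  # keep the k highest-scoring features
print(top_k)  # the discriminative feature 0 should rank first
```

The selected columns `X[:, top_k]` would then be fed to an SVM classifier in place of the full feature set, which is the filter-then-classify pipeline the abstract describes.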
