A sparse PCA for nonlinear fault diagnosis and robust feature discovery of industrial processes

Pearson’s correlation measure can only model linear dependence between random variables. Hence, conventional principal component analysis (PCA) based on Pearson’s correlation is not well suited to modern industrial processes, where process variables are often nonlinearly related. To address this problem, a non-parametric PCA model is proposed based on nonlinear correlation measures, namely Spearman’s and Kendall’s tau rank correlations. These two measures are also less sensitive to outliers than Pearson’s correlation, which makes the proposed PCA a robust feature extraction technique. To reveal meaningful patterns in process data, a generalized iterative deflation method is applied to the robust correlation matrix of the process data to sequentially extract a set of leading sparse pseudo-eigenvectors. For online fault diagnosis, the T² and SPE statistics are computed and analyzed with respect to the subspace spanned by the extracted pseudo-eigenvectors. The proposed method is applied to two industrial case studies. Its process monitoring performance is shown to be superior to that of conventional PCA and comparable to that of kernel PCA and kernel independent component analysis (KICA), at a lower computational cost. The proposed PCA is also more robust in sparse feature extraction from contaminated process data.
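The monitoring pipeline described above can be illustrated with a minimal sketch; it is not the authors' implementation. It builds a Spearman rank correlation matrix, takes its leading eigenvectors, and computes T² and SPE for a new sample. Several choices here are assumptions made for illustration: a dense eigendecomposition stands in for the sparse iterative deflation step, online samples are scaled with the training mean and standard deviation, ties in the ranks are ignored, and the names `spearman_corr`, `fit_rank_pca`, and `t2_spe` are hypothetical.

```python
import numpy as np


def spearman_corr(X):
    """Spearman correlation matrix: Pearson correlation of column-wise ranks.

    Assumption: no ties within a column (continuous measurements).
    """
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)


def fit_rank_pca(X, n_components):
    """Fit a PCA model on the rank correlation matrix of training data X."""
    R = spearman_corr(X)
    eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return {
        "mean": X.mean(axis=0),
        "std": X.std(axis=0, ddof=1),
        "P": eigvecs[:, order],   # loadings (dense here, not sparse)
        "lam": eigvals[order],    # corresponding eigenvalues
    }


def t2_spe(model, x):
    """Return (T^2, SPE) for one sample x under the fitted model."""
    z = (x - model["mean"]) / model["std"]  # assumed scaling for online data
    t = model["P"].T @ z                    # scores in the retained subspace
    t2 = float(np.sum(t**2 / model["lam"]))
    resid = z - model["P"] @ t              # part not explained by the model
    spe = float(resid @ resid)
    return t2, spe


# Synthetic data: the second variable is a monotone nonlinear function
# of the first, and the third is independent noise.
rng = np.random.default_rng(0)
s = rng.normal(size=500)
X = np.column_stack([s, np.exp(s), rng.normal(size=500)])

R = spearman_corr(X)
pearson = np.corrcoef(X, rowvar=False)
model = fit_rank_pca(X, n_components=2)
t2, spe = t2_spe(model, np.array([0.5, 5.0, 0.0]))  # relation x2 = exp(x1) broken
```

In this synthetic example the rank correlation between the first two variables is exactly one because their relationship is monotone, whereas the Pearson coefficient is lower. In practice, control limits for T² and SPE would be estimated from normal operating data before any sample is flagged as faulty.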
