Fault detection in batch processes through variable selection integrated to multiway principal component analysis

Abstract The main purpose of fault detection in batch process monitoring is to identify batches that behave atypically in comparison to normal operating data. Process automation has increased the number of measurable variables, yielding datasets in which the number of variables is much larger than the number of batches. This may compromise the performance of Multiway Principal Component Analysis (MPCA), the most popular quality control approach used in batch processes, so new strategies for handling high-dimensional datasets are needed. In this paper we propose the Pareto Variable Selection MPCA (PVS-MPCA) method to monitor batch processes described by high-dimensional datasets. The main idea of PVS-MPCA is to select the process variables that yield the best classification of production batches into conforming and non-conforming classes, prior to constructing the T² and Q control charts used to monitor batch performance. The method was applied to a real dataset from a chocolate conching batch operation and compared to classical MPCA-based monitoring. PVS-MPCA reduced the false alarm rate by 85.18% while retaining only 5 unfolded variables, as opposed to the 2,864 unfolded variables used in classical MPCA. The missed detection rate was zero, ensuring that only conforming batches were released to the production line.
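For readers unfamiliar with the monitoring statistics named in the abstract, the following is a minimal sketch of classical MPCA monitoring, not the paper's PVS-MPCA implementation. It assumes batch-wise unfolding of a (batches, variables, time) array, autoscaling, and control limits taken as empirical quantiles of the reference statistics; the function names, the synthetic data, the choice of 3 components, and the 0.99 quantile level are all illustrative assumptions. The Pareto variable selection step is not reproduced here.

```python
import numpy as np

def unfold_batchwise(X):
    # Assumed layout: (batches, variables, time) -> (batches, variables * time).
    return X.reshape(X.shape[0], -1)

def fit_mpca(X_unf, n_components):
    # Autoscale each unfolded column, then run PCA through the SVD.
    mean = X_unf.mean(axis=0)
    std = X_unf.std(axis=0, ddof=1)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    Z = (X_unf - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                             # loadings
    eigvals = s[:n_components] ** 2 / (Z.shape[0] - 1)  # score variances
    return mean, std, P, eigvals

def t2_and_q(model, X_unf):
    mean, std, P, eigvals = model
    Z = (X_unf - mean) / std
    T = Z @ P                               # scores
    t2 = np.sum(T ** 2 / eigvals, axis=1)   # Hotelling T^2 per batch
    E = Z - T @ P.T                         # residuals outside the model
    q = np.sum(E ** 2, axis=1)              # Q (squared prediction error)
    return t2, q

def empirical_limits(t2, q, level=0.99):
    # Assumption: quantile-based limits instead of distribution-based ones.
    return np.quantile(t2, level), np.quantile(q, level)

def alarm_rates(alarms, is_faulty):
    # Both classes must be present, otherwise a rate is undefined (NaN).
    false_alarm = np.mean(alarms[~is_faulty])  # conforming batches flagged
    missed = np.mean(~alarms[is_faulty])       # faulty batches not flagged
    return false_alarm, missed

# Synthetic illustration only: 30 reference batches, 4 variables, 20 time points.
rng = np.random.default_rng(0)
normal = rng.normal(size=(30, 4, 20))
model = fit_mpca(unfold_batchwise(normal), n_components=3)
t2_ref, q_ref = t2_and_q(model, unfold_batchwise(normal))
t2_lim, q_lim = empirical_limits(t2_ref, q_ref)

new = rng.normal(size=(10, 4, 20))
new[:3] += 10.0                    # first three batches carry a large shift
is_faulty = np.arange(10) < 3
t2_new, q_new = t2_and_q(model, unfold_batchwise(new))
alarms = (t2_new > t2_lim) | (q_new > q_lim)
false_alarm, missed = alarm_rates(alarms, is_faulty)
print(false_alarm, missed)
```

A batch is flagged when either statistic exceeds its limit. Restricting the unfolded matrix to a subset of columns before calling `fit_mpca` is where a variable selection step would enter in this sketch.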
