A recursively updated Map-Reduce based PCA for monitoring the time-varying fluorochemical engineering processes with big data

Abstract The highly toxic materials handled in fluorochemical engineering processes make their monitoring critical to public safety. Unfortunately, the complicated and time-varying characteristics of these processes hinder the wide application of advanced process monitoring strategies, especially big data techniques, in their monitoring systems. Therefore, a recursively updated Map-Reduce based principal component analysis (RMPCA) was proposed. A variable-width bin histogram was proposed to speed up the calculation of mutual information (MI), which was in turn used to split the variables into smaller, informative blocks for running on the Map-Reduce framework. Afterwards, a forgetting factor strategy was proposed to recursively update the distributed PCA model, the Bayesian decision fusion, and the hierarchical fault diagnosis scheme as new data arrive. Applications to an industrial production process for R-22, a common propellant and refrigerant, and to the Tennessee Eastman process confirmed the superiority of RMPCA in both fault detection and diagnosis for time-varying processes with big data.
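The two numerical ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the binning rule nor the update equations, so equal-frequency (quantile) bins stand in for the variable-width histogram, and a plain exponentially weighted mean and covariance update stands in for the forgetting factor strategy. All function names and the parameters `n_bins` and `forget` are hypothetical.

```python
import numpy as np


def quantile_edges(x, n_bins):
    """Variable-width bin edges holding roughly equal sample counts.

    Assumption: equal-frequency bins; the paper's binning rule may differ.
    """
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    return np.unique(edges)  # merge duplicate edges caused by tied values


def mutual_information(x, y, n_bins=8):
    """Histogram estimate of MI (in nats) between two non-constant 1-D samples."""
    counts, _, _ = np.histogram2d(
        x, y, bins=[quantile_edges(x, n_bins), quantile_edges(y, n_bins)]
    )
    pxy = counts / counts.sum()            # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, column vector
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, row vector
    mask = pxy > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


def update_model(mean, cov, x_new, forget=0.99):
    """Exponentially weighted update of mean and covariance with one sample.

    Assumption: a generic forgetting-factor form, not the paper's equations.
    forget = 1.0 keeps the old model; smaller values adapt faster.
    """
    new_mean = forget * mean + (1.0 - forget) * x_new
    d = x_new - new_mean
    new_cov = forget * cov + (1.0 - forget) * np.outer(d, d)
    return new_mean, new_cov


def principal_axes(cov, n_components):
    """Leading eigenvalues and eigenvectors of a symmetric covariance matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]
```

In a block-wise scheme of the kind the abstract describes, pairwise MI values of this sort could be used to group related variables into blocks, and each block's PCA model could then be refreshed by calling `update_model` followed by `principal_axes` as new samples arrive.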
