Pattern and Knowledge Extraction using Process Data Analytics: A Tutorial

Abstract: Traditional techniques employed by control engineers require a significant update to handle the increasing complexity of modern processes. Advances in statistical machine learning and distributed computation have produced an abundance of techniques suitable for advanced analysis. In this tutorial, we introduce data analytics techniques and discuss their theory and application to chemical processes. The focus here is on theory; applications will be explored more widely in a follow-up journal paper. The ultimate goal is to familiarize control engineers with how these techniques are used to extract valuable knowledge from raw data, which can then be used to make smarter process control decisions.
