Multi-rate Gaussian Bayesian network soft sensor development with noisy input and missing data

Abstract: Efficient process control and monitoring require accurate real-time information on quality variables. In industry, inferential (soft) sensors are often used to predict these quality (slow-rate) variables at a fast rate. However, most conventional soft-sensor methods do not use prior process knowledge even when it is available. The prediction accuracy of these inferential sensors depends mainly on the quality of the available data, which can be degraded by significant noise and possible sensor failures. To address these issues, this work develops a generic Gaussian Bayesian network (BN) based soft-sensor framework that can account for multiple hidden states and multirate/missing data. Because of the hidden variables and missing data, the posterior probability of these variables is evaluated by Bayesian inference in the E-step of the expectation-maximization (EM) algorithm. Compared with existing soft sensors, the proposed approach allows users to integrate prior knowledge into the BN structure. Moreover, owing to the probabilistic nature of BNs, the variances of the measurement noise and of the disturbances between hidden states are estimated simultaneously. The framework is generic and can be used for any multilayered structure. Its performance is demonstrated for two structures, a two-layer and a multilayered one, on a benchmark flow-network problem and an industrial process. The proposed Gaussian Bayesian network based soft sensors give significantly better and more reliable estimates than the conventional approaches.
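
As a rough illustration of the inference used in the E-step, the sketch below conditions a small linear Gaussian network on the variables that happen to be observed. It is not the authors' implementation: the three-node structure (hidden state h, noisy input x = h + e_x, quality variable y = a*h + b + e_y), the function name `condition_gaussian`, and all numerical values are assumptions chosen only for the example.

```python
import numpy as np

def condition_gaussian(mean, cov, obs_idx, obs_val):
    """Posterior mean and covariance of the unobserved entries of a joint Gaussian."""
    obs_idx = np.asarray(obs_idx, dtype=int)
    hid_idx = np.setdiff1d(np.arange(len(mean)), obs_idx)
    S_oo = cov[np.ix_(obs_idx, obs_idx)]
    S_ho = cov[np.ix_(hid_idx, obs_idx)]
    S_hh = cov[np.ix_(hid_idx, hid_idx)]
    # gain = S_ho @ inv(S_oo), computed with a linear solve instead of an explicit inverse
    gain = np.linalg.solve(S_oo, S_ho.T).T
    post_mean = mean[hid_idx] + gain @ (np.asarray(obs_val, dtype=float) - mean[obs_idx])
    post_cov = S_hh - gain @ S_ho.T
    return hid_idx, post_mean, post_cov

# Hypothetical parameters (assumed for illustration, not taken from the paper).
m, q = 0.0, 1.0            # prior mean and variance of the hidden state h
r_x = 1.0                  # measurement-noise variance of the fast-rate input x
a, b, r_y = 2.0, 1.0, 0.5  # y = a*h + b + e_y, with Var(e_y) = r_y

# Joint Gaussian over (h, x, y) implied by the network above.
mean = np.array([m, m, a * m + b])
cov = np.array([
    [q,     q,       a * q],
    [q,     q + r_x, a * q],
    [a * q, a * q,   a * a * q + r_y],
])

# A fast-rate sample: x is measured, the slow-rate y is missing.
hid_idx, post_mean, post_cov = condition_gaussian(mean, cov, obs_idx=[1], obs_val=[2.0])
print(hid_idx, post_mean, post_cov)
```

In a full EM loop, posterior moments of this kind would feed the M-step updates of the weights and noise variances; when a slow-rate sample of y is available, it is simply added to `obs_idx`.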
