Robust Bayesian networks for low-quality data modeling and process monitoring applications

Abstract In this paper, a novel robust Bayesian network is proposed for process modeling with low-quality data. Because unreliable data can cause model parameters to deviate from the true distributions and prevent the network structure from capturing the true causal relationships, a data quality feature is used to improve process modeling and monitoring performance. Given a predetermined trustworthy center, the quality of each data sample is evaluated through an exponential function of its Mahalanobis distance from that center. The conventional Bayesian network learning algorithms, covering both structure learning and parameter learning, are modified by applying the quality feature as a weight, so that useful information is extracted and a reasonable model is obtained. The effectiveness of the proposed method is demonstrated on the Tennessee Eastman (TE) benchmark process and a real industrial process.
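The quality-weighting idea can be illustrated with a minimal sketch. The abstract does not give the exact form of the exponential function or of the weighted learning steps, so everything below is an assumption made for illustration: the weight is taken as exp(-d / scale), where d is the Mahalanobis distance to the trustworthy center, and the weighted parameter learning is reduced to a weighted Gaussian mean and covariance estimate. The names quality_weights, weighted_gaussian_fit, and scale are hypothetical and do not come from the paper.

```python
import numpy as np


def quality_weights(X, center, cov, scale=1.0):
    """Assumed quality feature: exp(-d / scale), d = Mahalanobis distance to `center`.

    Returns one weight in (0, 1] per row of X; a sample at the center gets weight 1.
    """
    diff = X - center
    cov_inv = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every row, clipped to guard against round-off.
    d2 = np.maximum(np.einsum("ij,jk,ik->i", diff, cov_inv, diff), 0.0)
    return np.exp(-np.sqrt(d2) / scale)


def weighted_gaussian_fit(X, w):
    """Weighted maximum-likelihood mean and covariance of a Gaussian node.

    This stands in for the weighted parameter learning step; it is not the
    paper's algorithm, only the simplest weighted estimator.
    """
    w_sum = w.sum()
    mean = (w[:, None] * X).sum(axis=0) / w_sum
    diff = X - mean
    cov = (w[:, None] * diff).T @ diff / w_sum
    return mean, cov


# Synthetic example: 200 reliable samples near the origin plus 20 corrupted ones.
rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
corrupted = rng.normal(loc=8.0, scale=1.0, size=(20, 2))
X = np.vstack([clean, corrupted])

# The trustworthy center and its covariance are assumed to be known in advance.
center = np.zeros(2)
cov = np.eye(2)

w = quality_weights(X, center, cov)
robust_mean, robust_cov = weighted_gaussian_fit(X, w)
plain_mean = X.mean(axis=0)

print("plain mean:   ", plain_mean)
print("weighted mean:", robust_mean)
```

In this sketch the corrupted samples lie far from the center, so they receive weights close to zero and barely influence the weighted estimate, whereas the unweighted mean is pulled toward them. The same weights could in principle enter a structure-learning score as per-sample multipliers, but how the paper does this is not stated in the abstract.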
