Process modeling by Bayesian latent variable regression

Large quantities of measured data are routinely collected in many industries and used to extract linear models for tasks such as process control, fault diagnosis, and process monitoring. Existing linear modeling methods, however, do not fully utilize the information contained in the measurements. This work presents a new approach to linear process modeling, Bayesian latent-variable regression (BLVR), that makes fuller use of available process data and process knowledge. BLVR permits the extraction and incorporation of knowledge about the statistical behavior of the measurements when developing linear process models. It can also handle noise in both inputs and outputs, handle collinear variables, and incorporate prior knowledge about the regression parameters and the measured variables. The resulting model is usually more accurate than those obtained by existing methods, including ordinary least squares (OLS), principal component regression (PCR), and partial least squares (PLS). As presented, BLVR considers a univariate output and assumes the underlying variables and noise to be Gaussian, but it can also be used for multivariate outputs and other distributions. An empirical Bayes approach is developed to extract the prior information from historical data or from the maximum-likelihood solution of the available data. Examples of steady-state, dynamic, and inferential modeling demonstrate the superior accuracy of BLVR over existing methods, even when the assumption of Gaussian distributions is violated. The relationship between BLVR and existing methods, and opportunities for future work based on this framework, are also discussed.
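To make the role of the prior concrete, the following is a minimal Python sketch of ordinary Bayesian linear regression with a Gaussian prior on the regression parameters, where the prior is taken from a fit to historical data. It is not the BLVR algorithm: it ignores noise in the inputs and uses no latent variables, and the function names (`map_linear_regression`, `empirical_prior`), the simulated data, and the choice of an OLS fit on historical data as the prior are all assumptions made for illustration only.

```python
import numpy as np


def map_linear_regression(X, y, prior_mean, prior_cov, noise_var):
    """MAP estimate of b in y = X b + e, with e ~ N(0, noise_var * I)
    and a Gaussian prior b ~ N(prior_mean, prior_cov)."""
    prior_prec = np.linalg.inv(prior_cov)
    A = X.T @ X / noise_var + prior_prec
    rhs = X.T @ y / noise_var + prior_prec @ prior_mean
    return np.linalg.solve(A, rhs)


def empirical_prior(X_hist, y_hist):
    """Hypothetical empirical-Bayes step: use the OLS fit of historical
    data, and its estimated covariance, as the prior for new data."""
    b_hist, *_ = np.linalg.lstsq(X_hist, y_hist, rcond=None)
    resid = y_hist - X_hist @ b_hist
    noise_var = resid @ resid / (len(y_hist) - X_hist.shape[1])
    prior_cov = noise_var * np.linalg.inv(X_hist.T @ X_hist)
    return b_hist, prior_cov, noise_var


def simulate(rng, n, true_b, noise_std=0.1):
    """Simulated data with two strongly collinear inputs (illustrative)."""
    t = rng.normal(size=n)
    X = np.column_stack([t, t + 0.1 * rng.normal(size=n)])
    y = X @ true_b + noise_std * rng.normal(size=n)
    return X, y


rng = np.random.default_rng(0)
true_b = np.array([1.0, 2.0])

# A large historical data set supplies the prior; a small new set updates it.
X_hist, y_hist = simulate(rng, 200, true_b)
X_new, y_new = simulate(rng, 10, true_b)

prior_mean, prior_cov, noise_var = empirical_prior(X_hist, y_hist)
b_map = map_linear_regression(X_new, y_new, prior_mean, prior_cov, noise_var)
b_ols, *_ = np.linalg.lstsq(X_new, y_new, rcond=None)

print("OLS on new data only:", b_ols)
print("MAP with empirical prior:", b_map)
```

With a very diffuse prior the MAP estimate reduces to the OLS solution, and with a very tight prior it stays at the prior mean. This is the sense in which prior knowledge about the regression parameters is combined with the measured data in this simplified setting.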
