Bayesian principal component analysis

Principal component analysis (PCA) is a dimensionality reduction technique that transforms a set of process variables by rotating their axes of representation. Maximum likelihood PCA (MLPCA) is an extension that accounts for different noise contributions in each variable. Neither PCA nor any of its extensions uses external information about the model or data, such as the range or distribution of the underlying measurements. Such prior information can be extracted from measured data and used to greatly improve model accuracy. This paper develops a Bayesian PCA (BPCA) modeling algorithm that improves the accuracy of the estimated parameters and measurements by incorporating prior knowledge about the data and model. The proposed approach integrates modeling and feature extraction by simultaneously solving the parameter estimation and data reconciliation optimization problems. Methods for estimating the prior parameters from available data are discussed. Furthermore, BPCA reduces to PCA or MLPCA when a uniform prior is used. Several examples illustrate the benefits of BPCA over existing methods, even when the measurements violate the assumptions made about their distribution. Copyright © 2002 John Wiley & Sons, Ltd.
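To make the "rotation of axes" concrete, the following is a minimal sketch of ordinary two-variable PCA, the special case that the abstract says BPCA reduces to under a uniform prior. It is not the paper's BPCA algorithm and uses no prior information. The function names `pca_2d` and `reconstruct_rank1`, the synthetic line `y = 2x`, and the noise level 0.05 are all assumptions chosen for illustration.

```python
import math
import random


def pca_2d(points):
    """Return (mean, angle, scores) for 2-D PCA computed as an axis rotation."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Angle of the first principal axis (direction of largest variance).
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(angle), math.sin(angle)
    # Rotate each centered point into principal-axis coordinates.
    scores = [(c * x + s * y, -s * x + c * y) for x, y in centered]
    return (mx, my), angle, scores


def reconstruct_rank1(mean, angle, scores):
    """Keep only the first score and map it back to the original axes."""
    c, s = math.cos(angle), math.sin(angle)
    return [(mean[0] + c * t1, mean[1] + s * t1) for t1, _ in scores]


# Hypothetical data: noise-free points on y = 2x, then equal noise on both variables.
random.seed(0)
truth = [(t, 2.0 * t) for t in (random.gauss(0.0, 1.0) for _ in range(200))]
noisy = [(x + random.gauss(0.0, 0.05), y + random.gauss(0.0, 0.05)) for x, y in truth]

mean, angle, scores = pca_2d(noisy)
rectified = reconstruct_rank1(mean, angle, scores)
```

Dropping the second score and rotating back gives a rank-one estimate of the measurements, which is the data-reconciliation role that PCA plays here. A Bayesian variant would additionally weight this estimate by a prior on the scores or loadings, which this sketch deliberately omits.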
