Paper information - Dealing with missing data in MSPC: several methods, different interpretations, some examples

This paper addresses the problem of using future multivariate observations with missing data to estimate latent variable scores from an existing principal component analysis (PCA) model. This is a critical issue in multivariate statistical process control (MSPC) schemes, where the process is continuously interrogated based on an underlying PCA model. We present several methods for estimating the scores of new individuals with missing data: a so-called trimmed score method (TRI), a single-component projection method (SCP), a method of projection to the model plane (PMP), a method based on the iterative imputation of missing data, a method based on the minimization of the squared prediction error (SPE), a conditional mean replacement method (CMR) and two least squares-based methods: one based on a regression on known data (KDR) and the other based on a regression on trimmed scores (TSR). The basis for each method and the expressions for the score estimators, their covariance matrices and the estimation errors are developed. Some of the methods discussed have already been proposed in the literature (SCP, PMP and CMR), some are original (TRI and TSR) and others are shown to be equivalent to methods already developed by other authors: the iterative imputation and SPE methods are equivalent to PMP, and KDR is equivalent to CMR. These methods can be seen as different ways to impute values for the missing variables. The efficiency of the methods is studied through simulations based on an industrial data set. The KDR method is shown to be statistically superior to the other methods, except for the TSR method, in which the matrix to be inverted is much smaller. Copyright © 2002 John Wiley & Sons, Ltd.
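To make the score estimators named in the abstract more concrete, the following is a minimal NumPy sketch of four of them (TRI, PMP, KDR and TSR). The abstract gives no formulas, so the closed forms used here are assumptions: they are the least-squares expressions that follow from the verbal descriptions (zero-filling the centered missing values, projecting onto the known rows of the loadings, regressing the scores on the known variables, and regressing the scores on the trimmed scores). The function names `fit_pca` and `estimate_scores`, the NaN convention for missing values and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a PCA model on complete training data X (rows = observations)."""
    mean = X.mean(axis=0)
    S = np.cov(X - mean, rowvar=False)          # K x K sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, S, eigvecs[:, order], eigvals[order]

def estimate_scores(z, mean, S, P, theta, method="TSR"):
    """Estimate the scores of a new observation z; NaN marks a missing value.

    Assumed closed forms (not quoted from the paper):
      TRI: P*' z*
      PMP: (P*' P*)^-1 P*' z*
      KDR: Theta P*' (S**)^-1 z*
      TSR: Theta P*' P* (P*' S** P*)^-1 P*' z*
    where * selects the rows (and columns) of the known variables.
    """
    obs = ~np.isnan(z)
    z_obs = z[obs] - mean[obs]                   # centered known part of z
    P_obs = P[obs, :]                            # loadings of the known variables
    S_obs = S[np.ix_(obs, obs)]                  # covariance of the known variables
    Theta = np.diag(theta)                       # variances of the retained scores
    if method == "TRI":
        return P_obs.T @ z_obs
    if method == "PMP":
        return np.linalg.lstsq(P_obs, z_obs, rcond=None)[0]
    if method == "KDR":
        # Solves a system whose size is the number of known variables.
        return Theta @ P_obs.T @ np.linalg.solve(S_obs, z_obs)
    if method == "TSR":
        # Solves a system whose size is only the number of components.
        tau = P_obs.T @ z_obs
        M = P_obs.T @ S_obs @ P_obs
        return Theta @ (P_obs.T @ P_obs) @ np.linalg.solve(M, tau)
    raise ValueError(f"unknown method: {method}")

# Simulated data (hypothetical): 2 latent factors driving 6 measured variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
mean, S, P, theta = fit_pca(X, n_components=2)

z = X[0].copy()
true_scores = P.T @ (z - mean)                   # scores with complete data
z[[1, 4]] = np.nan                               # pretend two sensors failed
print("complete-data scores:", true_scores)
for name in ("TRI", "PMP", "KDR", "TSR"):
    print(name, estimate_scores(z, mean, S, P, theta, name))
```

With no missing values, all four estimators in this sketch reduce to the ordinary projection `P.T @ (z - mean)`, which is a convenient sanity check. The sizes of the linear systems also illustrate the abstract's remark about TSR: the KDR branch solves a system as large as the number of known variables, whereas the TSR branch solves one as large as the number of retained components.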

A. Ferrer | Francisco Arteaga
