Weighting schemes for updating regression models—a theoretical approach

Abstract: While multivariate calibration has been successfully employed in the monitoring of chemical processes, difficulties arise because sensors are inherently prone to drift and processes are susceptible to unmodeled upsets. Once an unmodeled source of variance has been detected in new samples, the usual remedy is to update the model with additional calibration samples that contain the new chemical interferent or instrumental variation. When relatively few new calibration samples are available, these new samples can be assigned higher weights by incorporating two or more copies of each when constructing the updated model. While weighting has been suggested as a means of improving prediction estimates for samples containing a new source of variance, no theoretical explanation has been provided as to why weighting is advantageous, and no criteria have been proposed for selecting weights for the new calibration samples. In this paper, the utility of sample weighting is explained theoretically using both model error and leverage arguments, and a leverage-based criterion for selecting weights for the new calibration samples is presented. Employing both simulated and process spectral data, a close correspondence is demonstrated between the weights selected using prediction-error criteria and those selected using leverage-based criteria. Additionally, paired simulation experiments show that the reduction in prediction error achieved by sample weighting increases as the level of noise in the responses increases, suggesting that this method will be of particular value when constructing calibration models using noisy instrumental responses.
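The weighting-by-copies idea described in the abstract can be illustrated with a minimal sketch. It assumes ordinary least squares without an intercept as a stand-in for the calibration model (the abstract does not specify the regression method), uses simulated data, and all names (`fit_with_copies`, `fit_weighted`, `leverage`, `X_old`, `X_new`, `x_test`) are hypothetical. It shows two things: stacking k copies of each new sample gives the same coefficients as weighted least squares with weight k, and the leverage of a test sample containing the new source of variance falls as that weight grows. It is not the paper's selection criterion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 30 original samples with no signal in the third variable
# (the "new interferent" direction) and 3 new samples that do contain it.
X_old = np.column_stack([rng.normal(size=(30, 2)), np.zeros(30)])
X_new = rng.normal(size=(3, 3)) + np.array([0.0, 0.0, 2.0])
b_true = np.array([1.0, -0.5, 0.8])  # assumed coefficients for the simulation
y_old = X_old @ b_true + 0.1 * rng.normal(size=30)
y_new = X_new @ b_true + 0.1 * rng.normal(size=3)


def fit_with_copies(X_old, y_old, X_new, y_new, copies):
    """Least-squares fit after stacking `copies` replicates of each new sample."""
    X = np.vstack([X_old] + [X_new] * copies)
    y = np.concatenate([y_old] + [y_new] * copies)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def fit_weighted(X_old, y_old, X_new, y_new, weight):
    """Weighted least squares: weight 1 for old samples, `weight` for new ones."""
    X = np.vstack([X_old, X_new])
    y = np.concatenate([y_old, y_new])
    w = np.concatenate([np.ones(len(y_old)), np.full(len(y_new), weight)])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into plain least squares
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef


def leverage(x, X_old, X_new, weight):
    """Leverage x^T (X_old^T X_old + weight * X_new^T X_new)^{-1} x of a test sample."""
    G = X_old.T @ X_old + weight * (X_new.T @ X_new)
    return float(x @ np.linalg.solve(G, x))


# A test sample that contains the new source of variance (third variable).
x_test = np.array([0.2, -0.1, 2.0])
for w in (1, 2, 5, 10):
    print(w, leverage(x_test, X_old, X_new, w))
```

In this sketch the leverage of `x_test` can only decrease as the weight increases, because a larger weight adds a positive semidefinite term to the weighted cross-product matrix. That alone does not select a weight; the trade-off against model error that the abstract mentions is not reproduced here.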
