A comparative study of linear regression methods in noisy environments

With advances in measurement instrumentation and metrology, it is often possible to specify rigorously the uncertainty associated with each measured value (e.g. concentrations, spectra, process sensor readings). Using this information together with the raw measurements should, in principle, lead to sounder data analysis, since the quality of the data can be taken into account explicitly. This should hold in particular when the noise is heteroscedastic and large in magnitude. In this paper we focus on alternative multivariate linear regression methods designed to take data uncertainties into account. We critically investigate their prediction and parameter estimation capabilities and suggest some modifications of well‐established approaches. All alternatives are tested under simulation scenarios that cover different noise and data structures. The results provide guidelines on which methods to use and when. Interestingly, some of the methods that explicitly incorporate uncertainty information in their formulations tend to perform less well in the examples studied, whereas others that do not incorporate it perform well overall. Copyright © 2005 John Wiley & Sons, Ltd.
