Robust Multivariate Methods in Chemometrics

This chapter presents an introduction to robust statistics with applications in chemometrics. Following a description of the basic ideas and concepts behind robust statistics, including how robust estimators can be conceived, the chapter builds up to the construction and use of robust alternatives to some multivariate analysis methods frequently used in chemometrics, such as principal component analysis and partial least squares. The chapter then provides insight into how these robust methods can be used for, or extended to, classification. To conclude, the validation of results is addressed: it is shown how uncertainty statements associated with robust estimates can be obtained.
