High-Breakdown Inference for Mixed Linear Models

Mixed linear models are used to analyze data in many settings, and in most cases they have a multivariate normal formulation. The parameters are usually estimated by the maximum likelihood estimator (MLE) or the residual MLE (REML). Both estimators, however, rest on the strong assumption of exact multivariate normality. Welsh and Richardson have shown that these estimators are not robust to small deviations from multivariate normality. In practice, this means that a small proportion of the data (even a single observation) can by itself drive the value of the estimates. Because the model is multivariate, we propose a high-breakdown robust estimator for very general mixed linear models that include, for example, covariates. This robust estimator belongs to the class of S-estimators, from which we can derive asymptotic properties for inference. We also use it as a diagnostic tool to detect outlying subjects. We discuss the advantages of this estimator over previously proposed robust estimators and illustrate its performance with simulation studies and the analysis of three datasets. As an alternative to the classical F-test, we also consider robust inference for multivariate hypotheses using a robust score-type test statistic proposed by Heritier and Ronchetti, and we study its properties through simulations and the analysis of real data.
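
The claim that a single observation can drive Gaussian likelihood-based estimates can be illustrated in a much simpler setting than the mixed linear model. The sketch below is not the S-estimator proposed here; it assumes a univariate sample with made-up values and compares the Gaussian MLE of location and scale (sample mean and standard deviation) with the median and the normalized median absolute deviation, two classical high-breakdown estimators. The function names `gaussian_mle` and `median_mad` are hypothetical.

```python
import statistics


def gaussian_mle(xs):
    """Gaussian MLE of location and scale: sample mean and population std."""
    return statistics.fmean(xs), statistics.pstdev(xs)


def median_mad(xs):
    """Median and normalized median absolute deviation (MAD)."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    # 1.4826 makes the MAD consistent for the standard deviation under normality.
    return med, 1.4826 * mad


# Illustrative data (assumed, not from the paper): ten values near 10.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
# Replace a single observation by a gross outlier.
contaminated = clean[:-1] + [1000.0]

for name, data in [("clean", clean), ("contaminated", contaminated)]:
    print(name, "MLE:", gaussian_mle(data), "median/MAD:", median_mad(data))
```

With one contaminated value out of ten, the mean and standard deviation move far from the bulk of the data, while the median and MAD stay close to their values on the clean sample.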
