A New Method for Dealing with Measurement Error in Explanatory Variables of Regression Models

We introduce a new method, moment reconstruction, for correcting for measurement error in covariates in regression models. The central idea is similar to regression calibration in that the values of the covariates that are measured with error are replaced by "adjusted" values. In regression calibration the adjusted value is the expectation of the true value conditional on the measured value. In moment reconstruction the adjusted value is the variance-preserving empirical Bayes estimate of the true value conditional on the outcome variable. The adjusted values thereby have the same first two moments and the same covariance with the outcome variable as the unobserved "true" covariate values. We show that moment reconstruction is equivalent to regression calibration in the case of linear regression, but leads to different results for logistic regression. For case-control studies with logistic regression and covariates that are normally distributed within cases and controls, we show that the resulting estimates of the regression coefficients are consistent. In simulations we demonstrate that for logistic regression, moment reconstruction is less biased than regression calibration, and that for case-control studies it is superior in mean-square error to the standard regression calibration approach. Finally, we give an example of the use of moment reconstruction in linear discriminant analysis and in a nonstandard problem where we wish to adjust a classification tree for measurement error in the explanatory variables.
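As an illustration of what a variance-preserving adjustment conditional on the outcome can look like, the following is a minimal sketch for a single covariate and a binary outcome. It rests on assumptions that the abstract does not state: a classical additive error model W = X + U, with U independent of both X and the outcome Y, and a known error variance. Under those assumptions E(X | Y) = E(W | Y) and Var(X | Y) = Var(W | Y) - Var(U), so one adjusted value with the required moments is E(W | Y)(1 - G) + W G, where G = sqrt(Var(X | Y) / Var(W | Y)). The function name `moment_reconstruction`, the parameter `sigma_u2`, and the simulated data are hypothetical and serve only this example.

```python
import numpy as np


def moment_reconstruction(w, y, sigma_u2):
    """Sketch of a variance-preserving adjustment of w within each level of y.

    Assumes (not from the abstract) classical error W = X + U with known
    error variance sigma_u2 and U independent of X and Y.
    """
    w = np.asarray(w, dtype=float)
    y = np.asarray(y)
    x_mr = np.empty_like(w)
    for level in np.unique(y):
        mask = y == level
        mean_w = w[mask].mean()            # estimates E(W | Y) = E(X | Y)
        var_w = w[mask].var(ddof=1)        # estimates Var(W | Y)
        var_x = var_w - sigma_u2           # implied estimate of Var(X | Y)
        if var_x <= 0:
            raise ValueError("error variance exceeds observed variance")
        g = np.sqrt(var_x / var_w)         # shrinkage factor G
        # Shrink toward the group mean so the group variance becomes var_x.
        x_mr[mask] = mean_w * (1.0 - g) + w[mask] * g
    return x_mr


# Simulated case-control style data (illustrative values only).
rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n)                 # 0 = control, 1 = case
x = rng.normal(loc=0.5 * y, scale=1.0)         # unobserved true covariate
sigma_u2 = 0.25                                # assumed known error variance
w = x + rng.normal(scale=np.sqrt(sigma_u2), size=n)

x_mr = moment_reconstruction(w, y, sigma_u2)
```

In this sketch the adjusted values keep the within-group sample means of W and have within-group sample variance equal to the observed variance minus the assumed error variance. They could then be used in place of W in a logistic regression or a discriminant analysis. With several covariates, the scalar G would have to be replaced by a matrix analogue, which is not shown here.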
