Likelihood analysis for errors-in-variables regression with replicate measurements

SUMMARY This paper advocates likelihood analysis for regression models with measurement errors in explanatory variables, for data problems in which the relevant distributions can be adequately modelled. Although computationally difficult, maximum likelihood estimates are more efficient than those based on first and second moment assumptions, and likelihood ratio inferences can be substantially better than those based on asymptotic normality of estimates. The EM algorithm is presented as a straightforward approach for likelihood analysis of normal linear regression with normal explanatory variables and normal replicate measurements.

Likelihood methods for regression analysis with explanatory variable measurement error have received little attention relative to methods based on first and second moment assumptions, largely because of computational difficulties, uncertainty about the robustness of likelihood methods, and the belief that the simpler methods perform just as well in practice. With improved computational tools for likelihood analysis, as discussed, for example, by Tanner (1993), it is worthwhile to reinvestigate these issues.

In particular, the common belief that the simpler methods perform just as well as likelihood methods may be naive. First, the comparison is usually cast in terms of efficiency, without regard to the validity of tests and confidence intervals. Likelihood ratio tests and confidence intervals can be substantially better than tests and confidence intervals based on the estimates and standard errors of the commonly used method of moments approaches, because the sampling distributions of those estimators are very often skewed, especially if the measurement errors are large. Furthermore, knowledge about the relative efficiency of the moment methods comes from a small number of efficiency comparisons from simulations of isolated situations, in papers promoting these simpler methods (Carroll, Ruppert & Stefanski, 1995, § 7.1).
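The skewness claim can be illustrated with a small simulation. The sketch below is not taken from the paper: the model (a standard normal true covariate, a single error-prone measurement with known error variance, a slope of one), the parameter values, and the function names are all assumptions chosen for illustration. It computes the moment-corrected slope estimate, the sample covariance of the measurement and the response divided by the sample variance of the measurement less the error variance, over repeated samples, and summarises the shape of its sampling distribution.

```python
import random
import statistics


def corrected_slope(n, beta, sigma_u, sigma_e, rng):
    """Simulate one data set and return the moment-corrected slope estimate.

    Assumed model (illustrative only): x ~ N(0, 1), w = x + u, y = beta * x + e,
    with u ~ N(0, sigma_u^2), e ~ N(0, sigma_e^2), and sigma_u treated as known.
    """
    w, y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                    # unobserved true covariate
        w.append(x + rng.gauss(0.0, sigma_u))      # error-prone measurement
        y.append(beta * x + rng.gauss(0.0, sigma_e))
    w_bar = statistics.fmean(w)
    y_bar = statistics.fmean(y)
    s_ww = sum((wi - w_bar) ** 2 for wi in w) / (n - 1)
    s_wy = sum((wi - w_bar) * (yi - y_bar) for wi, yi in zip(w, y)) / (n - 1)
    # Subtracting the known error variance undoes the attenuation of the naive slope.
    return s_wy / (s_ww - sigma_u ** 2)


def sample_skewness(values):
    """Standardised third central moment of a list of numbers."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def simulate(reps=2000, n=100, beta=1.0, sigma_u=0.7, sigma_e=0.5, seed=1):
    """Return the corrected slope estimates from `reps` independent samples."""
    rng = random.Random(seed)
    return [corrected_slope(n, beta, sigma_u, sigma_e, rng) for _ in range(reps)]


estimates = simulate()
print("mean    :", statistics.fmean(estimates))
print("median  :", statistics.median(estimates))
print("skewness:", sample_skewness(estimates))
```

Under these assumed settings the estimator is a ratio whose denominator varies appreciably from sample to sample, so a long right tail is to be expected. A symmetric interval of the form estimate plus or minus a multiple of the standard error cannot reflect that asymmetry, which is the concern raised above. Increasing `sigma_u` in the sketch should make the effect more pronounced.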
An important and often overlooked issue is that the moment methods may be especially inappropriate when the distribution of the true covariates is skewed (Pierce et al., 1992). Section 2 discusses computational approaches and difficulties associated with likelihood analysis in the measurement error problem generally. Section 3 illustrates the EM algorithm (Dempster, Laird & Rubin, 1977) for likelihood analysis of normal linear regression with a normally-distributed explanatory variable measured with normally-distributed error, when there are replicate measurements for at least some of the cases. Sections 5 and 6