Segmented regression with errors in predictors: semi-parametric and parametric methods.

We consider the estimation of parameters in a particular segmented generalized linear model with additive measurement error in predictors, with a focus on linear and logistic regression. In epidemiologic studies, segmented regression models often occur as threshold models, where it is assumed that the exposure has no influence on the response up to a possibly unknown threshold. Furthermore, in occupational and environmental studies the exposure typically cannot be measured exactly. Ignoring this measurement error leads to asymptotically biased estimators of the threshold. It is shown that this asymptotic bias is different from that observed for estimating standard generalized linear model parameters in the presence of measurement error, being both larger and in different directions than expected. In most cases considered, the threshold is asymptotically underestimated. Two standard general methods for correcting for this bias are considered: regression calibration and simulation extrapolation (simex). In ordinary logistic and linear regression these procedures behave similarly, but in the threshold segmented regression model they operate quite differently. The regression calibration estimator usually has more bias but less variance than the simex estimator. Regression calibration and simex are typically thought of as functional methods, also known as semi-parametric methods, because they make no assumptions about the distribution of the unobservable covariate X. The contrasting structural, parametric maximum likelihood estimator assumes a parametric distributional form for X. In ordinary linear regression there is typically little difference between structural and functional methods. One of the major, surprising findings of our study is that in threshold regression, the functional and structural methods differ substantially in their performance. In one of our simulations, approximately consistent functional estimates can be as much as 25 times more variable than the maximum likelihood estimate for a properly specified parametric model. Structural (parametric) modelling ought not to be a neglected tool in measurement error models. An example involving dust concentration and bronchitis in a mechanical engineering plant in Munich is used to illustrate the results.
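
As a rough illustration of the setting described above, the following sketch simulates a linear threshold model, adds additive measurement error to the exposure, and estimates the threshold by profile least squares, once from the true exposure X and once from the error-prone surrogate W = X + U. It is not the estimation procedure used in the study. The normal distributions, the sample size, all parameter values, and the names `fit_threshold`, `tau_true`, `tau_from_x`, and `tau_from_w` are assumptions made purely for illustration, and only the linear case is covered.

```python
import numpy as np


def fit_threshold(x, y, grid):
    """Profile least squares for the model E[y | x] = b0 + b1 * max(x - tau, 0).

    For each candidate threshold tau in `grid`, fit b0 and b1 by ordinary
    least squares and keep the tau with the smallest residual sum of squares.
    """
    best_tau, best_rss = None, np.inf
    for tau in grid:
        design = np.column_stack([np.ones_like(x), np.maximum(x - tau, 0.0)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        rss = float(resid @ resid)
        if rss < best_rss:
            best_tau, best_rss = float(tau), rss
    return best_tau


# Illustrative (assumed) simulation settings, not taken from the study.
rng = np.random.default_rng(0)
n = 5000
tau_true, beta0, beta1 = 1.0, 0.0, 2.0

x = rng.normal(1.0, 1.0, n)      # true, unobservable exposure X
u = rng.normal(0.0, 0.5, n)      # additive measurement error U
w = x + u                        # observed surrogate W = X + U
y = beta0 + beta1 * np.maximum(x - tau_true, 0.0) + rng.normal(0.0, 0.5, n)

grid = np.linspace(-1.0, 3.0, 201)   # candidate thresholds
tau_from_x = fit_threshold(x, y, grid)   # benchmark using the true exposure
tau_from_w = fit_threshold(w, y, grid)   # naive fit ignoring measurement error

print(f"true threshold:          {tau_true:.2f}")
print(f"estimate from true X:    {tau_from_x:.2f}")
print(f"naive estimate from W:   {tau_from_w:.2f}")
```

Comparing the two printed estimates over repeated seeds gives an informal sense of how ignoring the measurement error shifts the naive threshold estimate in this one assumed configuration; it does not reproduce the asymptotic results or the correction methods discussed above.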