Environmental data often contain long runs of consecutive missing values. A practical problem of this kind arose in the analysis of the adverse health effects of sulphur dioxide (SO2) levels on asthma hospital admissions for Sydney, Nova Scotia, Canada. Reliable techniques for imputing missing values are required to obtain valid estimates of associations with sparse health outcomes such as asthma hospital admissions. In this paper, a new method that incorporates prediction errors when imputing missing values is described, using daily mean sulphur dioxide levels that follow a stationary time series with a random error. Existing imputation methods do not incorporate the prediction errors. An optimal method is developed by extending a between forecast method to include prediction errors. Its validity and efficacy are demonstrated by comparing its performance with that of imputed values that do not include prediction errors. The optimal method improves the validity and accuracy of the β coefficient of the Poisson regression model for the association with asthma hospital admissions. Visual inspection of the sulphur dioxide levels imputed with prediction errors showed that the variation in the series is better captured. The method is computationally simple and can be incorporated into existing statistical software. Copyright © 2009 John Wiley & Sons, Ltd.
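The general idea of imputing a gap with and without prediction errors can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the model, so the sketch assumes an AR(1) series, combines a forward and a backward forecast with inverse-variance weights as a stand-in for the between forecast step, and draws independent normal errors for each imputed value. The function name `impute_gap_ar1` and all parameter choices are hypothetical.

```python
import numpy as np

def impute_gap_ar1(x, rng, add_error=True):
    """Fill interior NaN runs using AR(1) forward/backward forecasts (illustrative only)."""
    x = np.asarray(x, dtype=float).copy()
    obs = ~np.isnan(x)
    idx = np.flatnonzero(~obs)
    if idx.size == 0:
        return x

    # Estimate mean, AR(1) coefficient and residual sd from adjacent observed pairs.
    mu = x[obs].mean()
    pair = obs[:-1] & obs[1:]
    d0 = x[:-1][pair] - mu
    d1 = x[1:][pair] - mu
    phi = float(np.clip(np.sum(d0 * d1) / np.sum(d0 * d0), -0.99, 0.99))
    sigma = max(float((d1 - phi * d0).std(ddof=1)), 1e-12)

    # Split the missing indices into runs of consecutive positions.
    runs = np.split(idx, np.flatnonzero(np.diff(idx) > 1) + 1)
    for run in runs:
        start, end = run[0], run[-1]
        if start == 0 or end == len(x) - 1:
            continue  # assumption: gaps at the edges are left untouched
        m = len(run)
        k = np.arange(1, m + 1)   # steps ahead of the last value before the gap
        j = m + 1 - k             # steps back from the first value after the gap
        fwd = mu + phi ** k * (x[start - 1] - mu)
        bwd = mu + phi ** j * (x[end + 1] - mu)
        # h-step AR(1) forecast error variance: sigma^2 (1 - phi^(2h)) / (1 - phi^2)
        vf = sigma ** 2 * (1 - phi ** (2 * k)) / (1 - phi ** 2)
        vb = sigma ** 2 * (1 - phi ** (2 * j)) / (1 - phi ** 2)
        w = vb / (vf + vb)        # inverse-variance weight on the forward forecast
        est = w * fwd + (1 - w) * bwd
        if add_error:
            # Simplification: independent errors with the combined forecast variance.
            est = est + rng.normal(0.0, np.sqrt(vf * vb / (vf + vb)))
        x[run] = est
    return x

# Usage on simulated data (not the Sydney SO2 series).
rng = np.random.default_rng(0)
series = np.empty(200)
series[0] = 5.0
for t in range(1, 200):
    series[t] = 5.0 + 0.6 * (series[t - 1] - 5.0) + rng.normal(0.0, 1.0)
series[80:95] = np.nan
smooth = impute_gap_ar1(series, rng, add_error=False)
noisy = impute_gap_ar1(series, rng, add_error=True)
```

Without the error term the imputed stretch decays smoothly toward the series mean, which understates the day-to-day variation; adding the error term restores variability of roughly the size the fitted model implies.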