Estimates for small area compositions subjected to informative missing data

Estimation of small area (or domain) compositions may suffer from informative missing data if the probability of being missing varies across the categories of interest as well as across the small areas. We develop a double mixed modeling approach that combines a random effects mixed model for the underlying complete data with a random effects mixed model for the differential missing-data mechanism. The effect of the sampling design can be incorporated through a quasi-likelihood sampling model. The associated conditional mean squared error of prediction is approximated by a three-part decomposition: a naive prediction variance, a positive correction that accounts for the hypothetical parameter estimation uncertainty based on the latent complete data, and another positive correction for the extra variation due to the missing data. We illustrate our approach with an application to the estimation of municipality household compositions based on Norwegian register household data, which suffer from informative under-registration of the dwelling identity number.
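To see why category-dependent missingness is a problem for compositions, the following minimal sketch may help. It is not the double mixed model described above. The counts and missing probabilities are made-up numbers for a single hypothetical area, and the names `true_counts` and `p_missing` are assumptions introduced only for this illustration.

```python
import numpy as np

# Hypothetical true household counts by category in one small area (made-up numbers).
true_counts = np.array([600.0, 300.0, 100.0])
# Hypothetical probability that a unit is missing, differing by category.
p_missing = np.array([0.1, 0.3, 0.5])

# True composition: the category shares we would like to estimate.
true_comp = true_counts / true_counts.sum()

# Expected counts that remain observed after category-dependent missingness.
expected_observed = true_counts * (1.0 - p_missing)

# Naive composition computed from the observed units only; it is distorted
# because the categories lose units at different rates.
naive_comp = expected_observed / expected_observed.sum()

# If the observation probabilities were known, inverse weighting would undo the distortion.
adjusted = expected_observed / (1.0 - p_missing)
adjusted_comp = adjusted / adjusted.sum()

print("true    :", true_comp)
print("naive   :", naive_comp)
print("adjusted:", adjusted_comp)
```

In this toy setting the naive shares overstate the category that is least often missing. In practice the missing probabilities are unknown and vary by area, which is why the missing-data mechanism has to be modeled rather than plugged in as here.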
