Small Area Estimation under a Two Part Random Effects Model with Application to Estimation of Literacy in Developing Countries

The UNESCO Institute for Statistics has initiated a programme to collect data on the literacy levels of adults in developing countries. The programme will involve conducting small-scale surveys in a few countries, in which interviewees aged 15 and over are given a test to measure their literacy score. One of the main objectives of these surveys is to obtain summary measures of literacy levels in small geographical areas for which only very small samples would be available, thus requiring the use of model-based small area estimation methods. Available methods are not suitable for this kind of data, however, because of the mixed distribution of literacy scores in developing countries. This distribution has a large peak at zero, reflecting a large proportion of adults who are illiterate, and juxtaposed with this peak is an approximately bell-shaped distribution of the non-zero scores measured for the rest of the sample. In this paper we develop a two-part, three-level model that is suitable for this kind of data and show how to obtain the small area measures and their variances, or compute confidence intervals, based on this model. The proposed method is illustrated using simulated data and data obtained from a similar literacy survey conducted in Cambodia.
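
To make the shape of such data concrete, the following is a minimal illustrative sketch and not the model developed in the paper. It simulates scores with a point mass at zero and roughly bell-shaped non-zero scores, using one random effect per area in each part, and writes each area mean as the proportion of non-zero scores times the mean of the non-zero scores. All function names, parameter values, and the use of a single level of random effects (rather than three levels) are assumptions made purely for illustration.

```python
import math
import random


def simulate_area(rng, n, u_zero, u_score):
    """Simulate n literacy scores for one area under a simple two-part model.

    u_zero  : hypothetical area effect on the logit of P(score > 0)
    u_score : hypothetical area effect on the mean of the non-zero scores
    """
    # Part 1: probability of a non-zero score (logistic link, assumed intercept 0.5)
    p_positive = 1.0 / (1.0 + math.exp(-(0.5 + u_zero)))
    scores = []
    for _ in range(n):
        if rng.random() < p_positive:
            # Part 2: roughly bell-shaped non-zero score, clipped at 1 to stay positive
            scores.append(max(1.0, rng.gauss(50.0 + u_score, 10.0)))
        else:
            # Illiterate respondent: score is exactly zero
            scores.append(0.0)
    return scores


def two_part_mean(scores):
    """Area mean written as P(score > 0) * E[score | score > 0]."""
    positive = [s for s in scores if s > 0]
    if not positive:
        return 0.0
    return (len(positive) / len(scores)) * (sum(positive) / len(positive))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
areas = {}
for area in range(5):
    u_zero = rng.gauss(0.0, 0.8)   # assumed spread of area effects, part 1
    u_score = rng.gauss(0.0, 5.0)  # assumed spread of area effects, part 2
    areas[area] = simulate_area(rng, 200, u_zero, u_score)

for area, scores in areas.items():
    share_zero = sum(s == 0 for s in scores) / len(scores)
    print(area, round(share_zero, 2), round(two_part_mean(scores), 1))
```

The decomposition in `two_part_mean` reproduces the ordinary sample mean when applied to observed data. It becomes useful when each factor is predicted from its own model, which is what allows estimation for areas with very small samples.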