Design- and Model-Based Approaches to Small-Area Estimation in a Low- and Middle-Income Country Context: Comparisons and Recommendations

The need for rigorous and timely health and demographic summaries has driven an explosion in geographic studies, with a common approach being the production of pixel-level maps, particularly in low- and middle-income countries. In this context, household surveys are a major source of data, usually collected under a two-stage cluster design with stratification by region and urbanicity. Accurate estimates are crucial for precision public health policy interventions, but many current studies take a cavalier approach to acknowledging the sampling design while presenting results at a fine geographic scale. In this paper we investigate the extent to which accounting for the sampling design can affect predictions at the aggregate level, which is usually the target of inference. We describe a simulation study in which realistic sampling frames are created for Kenya, based on population and demographic information, with a survey design that mimics a Demographic and Health Survey (DHS). We compare the predictive performance of various commonly used models. We also describe a cluster-level model with a discrete spatial smoothing prior that has not been used previously but provides reliable inference. We find that including stratification and cluster-level random effects can improve predictive performance. Spatially smoothed direct (weighted) estimates were robust to priors and survey design. Continuous spatial models performed well in the presence of fine-scale variation; however, these models require the most "hand holding". Finally, we examine how the models perform on real data; specifically, we model the prevalence of secondary education for women aged 20-29 using data from the 2014 Kenya DHS.
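
As an illustration of the direct (weighted) estimates mentioned above, the following is a minimal sketch of a design-based prevalence estimate under a stratified cluster design. It assumes a Hájek-type ratio estimator with a with-replacement linearization variance, which is a standard textbook choice and not necessarily the estimator used in the paper. The function name `hajek_prevalence` and the record layout `(stratum, cluster, weight, y)` are hypothetical.

```python
from collections import defaultdict


def hajek_prevalence(records):
    """Weighted prevalence and an approximate design-based variance.

    records: iterable of (stratum, cluster, weight, y) tuples, y in {0, 1}.
    This layout is an assumption made for illustration only.
    """
    records = list(records)
    total_w = sum(w for _, _, w, _ in records)
    # Hajek (ratio) estimator: weighted mean of the binary outcome.
    p_hat = sum(w * y for _, _, w, y in records) / total_w

    # Linearized residuals, totalled per cluster within each stratum.
    z = defaultdict(lambda: defaultdict(float))
    for h, c, w, y in records:
        z[h][c] += w * (y - p_hat) / total_w

    # Between-cluster variability within strata
    # (with-replacement approximation at the first sampling stage).
    var = 0.0
    for clusters in z.values():
        n_h = len(clusters)
        if n_h < 2:
            continue  # a single-cluster stratum adds nothing in this sketch
        mean_z = sum(clusters.values()) / n_h
        var += n_h / (n_h - 1) * sum(
            (v - mean_z) ** 2 for v in clusters.values()
        )
    return p_hat, var


# Toy data (invented): two strata, two clusters each.
toy = [
    ("urban", 1, 2.0, 1),
    ("urban", 1, 2.0, 0),
    ("urban", 2, 2.0, 1),
    ("rural", 3, 4.0, 0),
    ("rural", 4, 4.0, 1),
]
p_hat, var = hajek_prevalence(toy)
```

In a smoothed direct approach, area-level estimates and variances of this kind would typically be passed to a spatial model; that step is omitted here.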
