High-resolution population estimation using household survey data and building footprints

The national census is an essential data source for decision-making in many areas of public interest. However, census data can become outdated during the intercensal period, which may last several decades. We developed a Bayesian hierarchical model that combines recent household surveys with probabilistic sampling designs and building footprints to produce up-to-date population estimates. We estimated population totals and age and sex breakdowns, with associated uncertainty measures, for grid cells of approximately 100 m in five provinces of the Democratic Republic of the Congo, a country where the last census was completed in 1984. The model fit well: R^2 was 0.79 for out-of-sample predictions of population totals at the microcensus-cluster level and 1.00 for age and sex proportions at the province level. The results confirm the benefits of combining household surveys and building footprints for high-resolution population estimation in countries with outdated censuses.
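For readers who want to reproduce a fit statistic of this kind, the following is a minimal sketch of an out-of-sample R^2 computed as one minus the ratio of the residual to the total sum of squares. The abstract does not state which R^2 definition was used (a squared correlation is another common choice), so this formula is an assumption, and the function name `r_squared` and the cluster values are hypothetical illustrations, not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumed definition; the source does not specify how its R^2 was computed.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Squared differences between held-out observations and model predictions
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared deviations of the observations from their own mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out cluster totals and predicted totals (not study data)
observed = [120, 340, 95, 410, 220]
predicted = [130, 310, 110, 380, 240]
r2 = r_squared(observed, predicted)
print(round(r2, 3))
```

With this definition, values near 1 mean the predictions explain most of the variation among held-out clusters; unlike a squared correlation, the value can fall below 0 when predictions are worse than the observed mean.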
