Exploring the high-resolution mapping of gender-disaggregated development indicators

Improved understanding of geographical variation and inequity in health status, wealth and access to resources within countries is increasingly recognized as central to meeting development goals. Development and health indicators assessed at national or subnational scale can conceal important inequities, with the rural poor often least well represented. The ability to target limited resources is fundamental, especially in an international context where funding for health and development is under pressure. This has recently prompted exploration of spatial interpolation methods based on geolocated clusters from national household survey data for the high-resolution mapping of features such as population age structures, vaccination coverage and access to sanitation. It remains unclear, however, how predictable these factors are across settings, variables and demographic groups. Here we test the accuracy of spatial interpolation methods in producing gender-disaggregated high-resolution maps of the rates of literacy, stunting and the use of modern contraceptive methods from a combination of geolocated Demographic and Health Surveys (DHS) cluster data and geospatial covariates. Bayesian geostatistical and machine learning modelling methods were tested across four low-income countries and varying gridded environmental and socio-economic covariate datasets to build 1×1 km spatial resolution maps with uncertainty estimates. Results show the potential of the approach in producing high-resolution maps of key gender-disaggregated socio-economic indicators, with explained variance through cross-validation being as high as 74–75% for female literacy in Nigeria and Kenya, and in the 50–70% range for many other variables.
However, substantial variation by both country and variable was seen, with many variables showing poor mapping accuracies in the range of 2–30% explained variance using both geostatistical and machine learning approaches. The analyses offer a robust basis for the construction of timely maps with levels of detail that support geographically stratified decision-making and the monitoring of progress towards development goals. At the same time, the great variability in results between countries and variables highlights the challenges in applying these interpolation methods universally across multiple countries, and the importance of validation and of quantifying uncertainty if this is undertaken.
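
As an illustration of the accuracy measure reported above, the following minimal sketch computes explained variance from k-fold cross-validated predictions. It rests on several assumptions that are not taken from the study: explained variance is defined here as 1 − SS_res/SS_tot on held-out predictions, the data are synthetic, and a one-covariate least-squares line stands in for the Bayesian geostatistical and machine learning models actually tested. All function and variable names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x (stand-in model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def explained_variance(observed, predicted):
    """Assumed definition: 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def cross_validated_predictions(xs, ys, k=5, seed=0):
    """Predict each point from a model fitted without that point's fold."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    predictions = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in fold:
            predictions[i] = intercept + slope * xs[i]
    return predictions


# Synthetic survey clusters: one covariate value and one indicator rate each.
rng = random.Random(42)
covariate = [rng.uniform(0.0, 1.0) for _ in range(200)]
rate = [0.2 + 0.6 * x + rng.gauss(0.0, 0.1) for x in covariate]

preds = cross_validated_predictions(covariate, rate, k=5)
score = explained_variance(rate, preds)
print(f"Cross-validated explained variance: {score:.2f}")
```

Because every prediction comes from a model that never saw the corresponding cluster, the score reflects out-of-sample mapping accuracy rather than goodness of fit, which is the sense in which low values such as 2–30% indicate poor predictability.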
