Improving national level spatial mapping of malaria through alternative spatial and spatio-temporal models.

Abstract: The most common approach to spatial prediction of malaria in the literature is to approximate a Gaussian process model using the stochastic partial differential equation (SPDE) approach. We compared SPDE with computationally faster alternatives, the generalized additive model (GAM) and a state-of-the-art machine learning method, gradient boosted trees (GBM), with respect to their predictive skill for country-level malaria prevalence mapping. We also evaluated the intuition that incorporating past data and using spatio-temporal models may improve the predictive accuracy of the present spatial distribution of malaria. Model performance varied among countries and settings, with SPDE and GAM generally performing well. Including past data was beneficial for GAM and GBM, but not for SPDE. We further investigated the weaknesses of SPDE in the spatio-temporal setting and of GAM at the edges of countries. Taken together, we believe that spatial and spatio-temporal SPDE models should be evaluated alongside the alternatives, or at least alongside GAM.
