Combining data from multiple spatially referenced prevalence surveys using generalized linear geostatistical models

Data from multiple prevalence surveys can provide information on common parameters of interest, which can therefore be estimated more precisely in a joint analysis than by separate analyses of the data from each survey. However, fitting a single model to the combined data from multiple surveys is inadvisable without testing the implicit assumption that all of the surveys are directed at the same inferential target. We propose a multivariate generalized linear geostatistical model that accommodates two sources of heterogeneity across surveys, to correct for spatially structured bias in non-randomized surveys and to allow for temporal variation in the underlying prevalence surface between consecutive survey periods. We describe a Monte Carlo maximum likelihood procedure for parameter estimation and show through simulation experiments how accounting for the different sources of heterogeneity among surveys in a joint model leads to more precise inferences. We describe an application to multiple surveys of the prevalence of malaria conducted in Chikhwawa District, Southern Malawi, and discuss how this approach could inform hybrid sampling strategies that combine data from randomized and non-randomized surveys to make the most efficient use of all available data.
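To make the idea of spatially structured bias concrete, the following is a minimal simulation sketch, not the authors' model or code. It assumes a logit link, binomial counts, a spatial field `S` shared by both surveys, and an extra bias field `B` that enters only at locations from a non-randomized (convenience) survey. The exponential covariance, every parameter value, and the names `exp_cov` and `simulate_two_surveys` are illustrative assumptions; temporal variation between survey periods is left out.

```python
import numpy as np


def exp_cov(coords, variance, scale):
    """Exponential covariance matrix (an assumed choice of correlation function)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return variance * np.exp(-d / scale)


def simulate_two_surveys(n_random=150, n_convenience=150, seed=0):
    """Simulate binomial prevalence data from a randomized and a convenience survey.

    All numerical values below are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    n = n_random + n_convenience
    coords = rng.uniform(0.0, 1.0, size=(n, 2))
    is_convenience = np.arange(n) >= n_random

    # Small diagonal jitter keeps the Cholesky factorization numerically stable.
    jitter = 1e-6 * np.eye(n)

    # Shared spatial field S(x): the common inferential target of both surveys.
    chol_s = np.linalg.cholesky(exp_cov(coords, 1.0, 0.2) + jitter)
    S = chol_s @ rng.standard_normal(n)

    # Bias field B(x): only contributes at convenience-survey locations.
    chol_b = np.linalg.cholesky(exp_cov(coords, 0.5, 0.1) + jitter)
    B = chol_b @ rng.standard_normal(n)

    # Linear predictor on the logit scale: intercept + S, plus (shift + B) if biased.
    eta = -1.0 + S + is_convenience * (0.5 + B)
    p = 1.0 / (1.0 + np.exp(-eta))

    m = np.full(n, 20)        # number of people tested at each location
    y = rng.binomial(m, p)    # number of positive tests
    return {"coords": coords, "is_convenience": is_convenience,
            "m": m, "y": y, "p": p}


if __name__ == "__main__":
    sim = simulate_two_surveys()
    conv = sim["is_convenience"]
    print("empirical prevalence, randomized survey: ",
          sim["y"][~conv].sum() / sim["m"][~conv].sum())
    print("empirical prevalence, convenience survey:",
          sim["y"][conv].sum() / sim["m"][conv].sum())
```

Under these assumptions, pooling the two surveys naively would mix the target prevalence with the extra bias term, which is the situation a joint model with an explicit bias component is meant to handle.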
