Some findings on zero-inflated and hurdle Poisson models for disease mapping

Zero excess in geographically referenced mortality data sets has received considerable attention in the literature, with zero-inflation being the most common way to handle this lack of fit. Hurdle models have also been used in disease mapping studies, but less often. We show in this paper that models with a specific treatment of zero excess are often required to achieve appropriate fits in regular mortality studies since, otherwise, geographical units with low expected counts are oversmoothed. However, as we also show, an indiscriminate treatment of zero excess may be unnecessary and is problematic to implement. In this regard, we find that naive zero-inflation and hurdle models, which do not explicitly model the probabilities of zeroes, do not fix zero-excess problems well enough and are clearly unsatisfactory. Results strongly suggest the need to model these probabilities explicitly and to let them vary across areal units. Unfortunately, these more flexible modeling strategies can easily lead to improper posterior distributions, as we prove in several theoretical results. Such procedures have been used repeatedly in the disease mapping literature, and one should bear these issues in mind in order to propose valid models. Finally, in light of these results, we propose several valid modeling alternatives that are suitable for fitting zero excess. We show that these proposals fix zero-excess problems and correct the oversmoothing of risks in sparsely populated units, depicting geographic patterns better suited to the data.
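For readers unfamiliar with the two model families contrasted above, the following minimal sketch writes out their probability mass functions for a single areal unit. It uses the textbook definitions rather than the specific models proposed in the paper. The names `mu` (Poisson mean, assumed positive), `pi` (zero-inflation probability) and `p0` (hurdle probability of a zero) are illustrative assumptions, not notation taken from the paper.

```python
import math


def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu (mu > 0 assumed)."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def zip_pmf(y, mu, pi):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise an ordinary Poisson(mu) draw (which may also be zero)."""
    base = (1.0 - pi) * poisson_pmf(y, mu)
    return pi + base if y == 0 else base


def hurdle_pmf(y, mu, p0):
    """Hurdle Poisson: a zero with probability p0; positive counts
    follow a Poisson(mu) truncated at zero."""
    if y == 0:
        return p0
    return (1.0 - p0) * poisson_pmf(y, mu) / (1.0 - math.exp(-mu))


if __name__ == "__main__":
    mu = 0.5  # hypothetical low expected count for a small areal unit
    print("Poisson P(Y=0):", poisson_pmf(0, mu))
    print("ZIP     P(Y=0):", zip_pmf(0, mu, pi=0.3))
    print("Hurdle  P(Y=0):", hurdle_pmf(0, mu, p0=0.8))
```

In this formulation the zero-inflated model can only add zeros relative to the Poisson, whereas the hurdle model sets the zero probability freely. A "naive" version in the sense of the abstract would share one `pi` or `p0` across all units, while the more flexible versions would let that probability vary by unit.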
