Modeling type 1 and type 2 diabetes mellitus incidence in youth: an application of Bayesian hierarchical regression for sparse small area data.

Sparse count data violate the assumptions of traditional Poisson models because they contain an excess of zeros, which makes such data challenging to model. Because aggregating data to reduce sparseness may bias estimates of risk, solutions need to be found at the level of the disaggregated data. We investigated statistical approaches within a Bayesian hierarchical framework for modeling sparse data without aggregation. Using simulated data, we compared our proposed models with the traditional Poisson model and the zero-inflated model. We then applied the models to type 1 and type 2 diabetes in youth aged 10-19 years, which are known to be rare diseases, and compared the models using the inference results and various model diagnostic tools. One of our proposed models, a sparse Poisson convolution model, performed better than the other models in both the simulation and the application, as judged by the deviance information criterion (DIC) and the mean squared prediction error.
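To illustrate why excess zeros conflict with a plain Poisson model, the following minimal sketch compares the probability of a zero count under a Poisson distribution with that under a standard zero-inflated Poisson (ZIP) mixture. It is not the sparse Poisson convolution model proposed here, whose specification is not given in this abstract. The function names and the values of `lam` (Poisson mean) and `pi` (structural-zero probability) are assumptions chosen only for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing count k under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the count is a
    structural zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


# Illustrative (assumed) values: a low expected count per small area
# and a 30% chance that an area yields a structural zero.
lam = 0.5
pi = 0.3

p0_poisson = poisson_pmf(0, lam)
p0_zip = zip_pmf(0, lam, pi)

# Mean and variance of the ZIP distribution (standard closed forms).
zip_mean = (1.0 - pi) * lam
zip_var = (1.0 - pi) * lam * (1.0 + pi * lam)

print(f"P(Y = 0) under Poisson: {p0_poisson:.4f}")
print(f"P(Y = 0) under ZIP:     {p0_zip:.4f}")
print(f"ZIP mean: {zip_mean:.4f}, ZIP variance: {zip_var:.4f}")
```

Under these assumed values the ZIP mixture places more mass at zero than the Poisson with the same `lam`, and its variance exceeds its mean, which a single-parameter Poisson model cannot reproduce.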
