Beta-negative binomial nonlinear spatio-temporal random effects modeling of COVID-19 case counts in Japan

Coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has spread widely throughout the world. Predicting its future spread, measured as the number of cases, can help in preparing for and preventing a worst-case scenario. Statistical modeling of past data is one feasible approach to such prediction. This paper describes spatio-temporal modeling of COVID-19 case counts in the 47 prefectures of Japan using a nonlinear random effects model, in which random effects capture the heterogeneity across prefectures in several model parameters. The negative binomial distribution is frequently used with the Paul-Held random effects model to account for overdispersion in count data; however, it is known to be incapable of accommodating extreme observations such as those found in the COVID-19 case count data. We therefore propose using the beta-negative binomial distribution with the Paul-Held model. This distribution is a generalization of the negative binomial distribution that has attracted much attention in recent years because it can model extreme observations while remaining analytically tractable. The proposed beta-negative binomial model was applied to multivariate count time series data of COVID-19 cases in the 47 prefectures of Japan. Evaluation by one-step-ahead prediction showed that the proposed model can accommodate extreme observations without sacrificing predictive performance.
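
To illustrate why the beta-negative binomial distribution can accommodate extreme counts, the following minimal sketch compares its log-probability mass function with that of a negative binomial distribution having the same mean. It assumes the standard mixture construction, in which a negative binomial count with success probability p has p drawn from a Beta(alpha, beta) distribution. The function names and parameter values (r = 5, alpha = 3, beta = 2) are hypothetical choices for illustration, and the parameterization is not necessarily the one used in the paper.

```python
import math


def log_beta(a, b):
    """Log of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def nb_logpmf(k, r, p):
    """Log-pmf of a negative binomial count k >= 0.

    Assumed convention: k failures before the r-th success,
    with success probability p, so the mean is r * (1 - p) / p.
    """
    return (math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log1p(-p))


def bnb_logpmf(k, r, alpha, beta):
    """Log-pmf of the beta-negative binomial distribution.

    Obtained by integrating the negative binomial pmf above over
    p ~ Beta(alpha, beta). The mean is r * beta / (alpha - 1)
    when alpha > 1.
    """
    return (math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
            + log_beta(alpha + r, beta + k) - log_beta(alpha, beta))


# Illustrative (hypothetical) parameters: both distributions have mean 5.
r, alpha, beta = 5.0, 3.0, 2.0
p = 0.5

for k in (5, 20, 50, 100):
    print(k, nb_logpmf(k, r, p), bnb_logpmf(k, r, alpha, beta))
```

Under these assumptions the beta-negative binomial tail decays polynomially in k, whereas the negative binomial tail decays geometrically, so a count far above the mean receives much higher probability under the former.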
