Integer‐valued functional data analysis for measles forecasting

Measles presents a unique and imminent challenge for epidemiologists and public health officials: the disease is highly contagious, yet vaccination rates are declining precipitously in many localities. Consequently, the risk of a measles outbreak continues to rise. To improve preparedness, we study historical measles data from both the pre- and post-vaccine eras, and design new methodology to forecast measles counts with uncertainty quantification. We propose to model the disease counts as an integer-valued functional time series: within each year, the measles counts are a function of time of year, and these yearly functions are ordered in time. The counts are modeled using a negative binomial distribution conditional on a real-valued latent process; the negative binomial distribution accounts for the overdispersion observed in the data. The latent process is decomposed using a basis expansion in which the basis functions are unknown and learned from the data, and the basis coefficients are dynamic. The resulting framework provides enhanced capability to model complex seasonality, which varies dynamically from year to year, and offers improved multi-month-ahead point forecasts and substantially tighter forecast intervals (with correct coverage) compared to existing forecasting models. Importantly, the fully Bayesian approach provides well-calibrated and precise uncertainty quantification for epidemiologically relevant features, such as the future value and time of the peak measles count in a given year. An R package is available online.
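The generative structure described above can be illustrated with a minimal simulation sketch. This is not the authors' implementation or their R package: the function name `simulate_counts`, the fixed Fourier basis (the paper learns the basis from the data), the AR(1) evolution of the coefficients, the log link, and all numeric values are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_counts(n_years=10, n_months=12, phi=0.8, size=5.0, seed=0):
    """Simulate an integer-valued functional time series (illustrative only).

    Rows index years, columns index time of year. All modeling choices here
    are assumptions for the sketch, not the paper's specification.
    """
    rng = np.random.default_rng(seed)
    tau = np.arange(n_months) / n_months  # time of year on [0, 1)

    # ASSUMPTION: a fixed basis (intercept plus one sine/cosine pair).
    # The paper instead treats the basis as unknown and learns it.
    basis = np.column_stack(
        [np.ones(n_months), np.sin(2 * np.pi * tau), np.cos(2 * np.pi * tau)]
    )

    # ASSUMPTION: dynamic basis coefficients follow an AR(1) around a fixed level.
    level = np.array([3.0, 1.0, 0.5])
    beta = np.zeros((n_years, basis.shape[1]))
    beta[0] = level
    for t in range(1, n_years):
        noise = rng.normal(scale=0.2, size=level.size)
        beta[t] = level + phi * (beta[t - 1] - level) + noise

    # Real-valued latent process: one curve over time of year for each year.
    latent = beta @ basis.T

    # ASSUMPTION: log link, so the conditional mean is exp(latent).
    mean = np.exp(latent)

    # NumPy parameterizes the negative binomial by (n, p) with mean n(1-p)/p.
    # Setting p = size / (size + mean) gives the desired mean and
    # variance mean + mean**2 / size, i.e. overdispersion relative to Poisson.
    p = size / (size + mean)
    counts = rng.negative_binomial(size, p)
    return counts, mean


counts, mean = simulate_counts()
print(counts[0])  # simulated monthly counts for the first year
```

Smaller values of the hypothetical `size` parameter produce more overdispersion, while `phi` controls how strongly the seasonal shape in one year persists into the next.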
