Tracking disease outbreaks from sparse data with Bayesian inference

The COVID-19 pandemic provides new motivation for a classic problem in epidemiology: estimating the empirical rate of transmission during an outbreak (formally, the time-varying reproduction number) from case counts. Standard methods exist, but they work best at coarse-grained national or state scales where data are abundant, and they struggle with the partial observability and sparse data common at finer scales (e.g., individual schools or towns). For example, case counts may be sparse when a testing program catches only a small fraction of infections. Similarly, whether an infected individual tests positive may depend on the kind of test and on the time at which they are tested. We propose a Bayesian framework that accommodates partial observability in a principled manner. Our model places a Gaussian process prior over the unknown reproduction number at each time step and treats observations as samples from the distribution induced by a specific testing program. This allows the framework to accommodate a variety of tests (viral RNA, antibody, antigen, etc.) and sampling schemes (e.g., longitudinal or cross-sectional screening). Inference in this framework is complicated by the presence of tens or hundreds of thousands of discrete latent variables. To address this challenge, we propose an efficient stochastic variational inference method that relies on a novel gradient estimator for the variational objective. Experimental results for an example motivated by COVID-19 show that our method produces an accurate and well-calibrated posterior, while standard methods for estimating the reproduction number can fail badly.
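
To make the generative setup concrete, the following is a minimal sketch, not the model from this work. It assumes a squared-exponential kernel on log R_t, a simple renewal-style transmission process with Poisson offspring, and binomial thinning as a stand-in for a testing program. All function names, hyperparameters, and the generation-interval weights are hypothetical and chosen only for illustration.

```python
import numpy as np


def rbf_kernel(times, length_scale=7.0, variance=0.25):
    """Squared-exponential covariance between all pairs of time points (assumed kernel)."""
    diffs = times[:, None] - times[None, :]
    return variance * np.exp(-0.5 * (diffs / length_scale) ** 2)


def sample_log_rt(num_days, rng, mean_log_r=0.0):
    """Draw one sample of log R_t from a GP prior with hypothetical hyperparameters."""
    times = np.arange(num_days, dtype=float)
    cov = rbf_kernel(times) + 1e-6 * np.eye(num_days)  # jitter for numerical stability
    chol = np.linalg.cholesky(cov)
    return mean_log_r + chol @ rng.standard_normal(num_days)


def simulate_observed_cases(r_t, gen_interval, detect_prob, rng, seed_cases=5):
    """Simulate latent infections, then thin them to mimic partial observation.

    Assumption: each infection is detected independently with probability
    detect_prob, which ignores test type and timing.
    """
    num_days = len(r_t)
    infections = np.zeros(num_days, dtype=np.int64)
    infections[0] = seed_cases
    for t in range(1, num_days):
        # Infectious pressure: past infections weighted by the generation interval.
        lags = np.arange(1, min(t, len(gen_interval)) + 1)
        pressure = np.sum(infections[t - lags] * gen_interval[lags - 1])
        infections[t] = rng.poisson(r_t[t] * pressure)
    observed = rng.binomial(infections, detect_prob)
    return infections, observed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r_t = np.exp(sample_log_rt(60, rng))  # log link keeps R_t positive
    gen_interval = np.array([0.2, 0.5, 0.3])  # hypothetical weights summing to 1
    infections, observed = simulate_observed_cases(r_t, gen_interval, 0.1, rng)
    print(infections.sum(), observed.sum())
```

With a low detection probability, the observed series is typically mostly zeros even when the latent epidemic is active, which is the sparse-data regime the abstract describes. Inference would run in the reverse direction, from the observed counts back to a posterior over R_t.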
