Early Detection of Influenza Outbreaks in the United States

Public health surveillance systems often fail to detect emerging infectious diseases, particularly in resource-limited settings. By integrating relevant clinical and internet-source data, we can close critical gaps in coverage and accelerate outbreak detection. Here, we present a multivariate algorithm that uses freely available online data to provide early warning of emerging influenza epidemics in the US. We evaluated 240 candidate predictors and found that the most predictive combination does not include surveillance or electronic health records data, but instead consists of eight Google search and Wikipedia pageview time series reflecting changing levels of interest in influenza-related topics. In cross-validation on 2010-2016 data, this algorithm sounds alarms an average of 16.4 weeks before influenza activity reaches the Centers for Disease Control and Prevention (CDC) threshold for declaring the start of the season. In an out-of-sample test on data from the rapidly emerging fall wave of the 2009 H1N1 pandemic, it recognized the threat five weeks in advance of this surveillance threshold. Simpler algorithms, including fixed week-of-the-year triggers, lag the optimized alarms by only a few weeks when detecting seasonal influenza, but fail to provide early warning in the 2009 pandemic scenario. This work demonstrates a robust method for designing next-generation outbreak detection algorithms. By combining scan statistics with machine learning, the method identifies tractable combinations of data sources (from among thousands of candidates) that can provide early warning of emerging infectious disease threats worldwide.
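To make the notion of "weeks of early warning" concrete, the following is a minimal sketch of how an alarm's lead time over a surveillance threshold could be measured. It is not the multivariate, optimized algorithm described above: it swaps in a simple one-sided EWMA control chart on a single proxy series, and all function names, parameter values, and data are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def ewma_alarm_week(series: Sequence[float], baseline: float,
                    alarm_level: float, weight: float = 0.5) -> Optional[int]:
    """First week at which a one-sided EWMA of (value - baseline) exceeds alarm_level.

    Illustrative univariate stand-in; the baseline, weight, and alarm level
    are assumptions, not values from the study.
    """
    stat = 0.0
    for week, value in enumerate(series):
        # Exponentially weighted excess over baseline, floored at zero (one-sided).
        stat = max(0.0, (1.0 - weight) * stat + weight * (value - baseline))
        if stat > alarm_level:
            return week
    return None


def first_crossing(series: Sequence[float], threshold: float) -> Optional[int]:
    """First week at which the surveillance series reaches the threshold."""
    for week, value in enumerate(series):
        if value >= threshold:
            return week
    return None


def lead_time_weeks(alarm_week: Optional[int],
                    onset_week: Optional[int]) -> Optional[int]:
    """Weeks by which the alarm precedes the threshold crossing (negative if late)."""
    if alarm_week is None or onset_week is None:
        return None
    return onset_week - alarm_week


# Toy data (hypothetical): a proxy signal such as search interest, and a
# surveillance signal such as percent of visits for influenza-like illness.
proxy = [10, 10, 11, 14, 18, 25, 30, 40, 50, 60]
ili = [1.0, 1.1, 1.0, 1.2, 1.3, 1.5, 1.8, 2.3, 2.6, 3.0]

alarm = ewma_alarm_week(proxy, baseline=10.0, alarm_level=2.0)
onset = first_crossing(ili, threshold=2.2)  # threshold value is an assumption
print(lead_time_weeks(alarm, onset))
```

With these toy series, the proxy alarm fires at week 3 and the surveillance series crosses the assumed threshold at week 7, giving a lead time of 4 weeks. A fixed week-of-the-year trigger would correspond to replacing `ewma_alarm_week` with a constant week index, which is why such a trigger cannot adapt to an off-season wave.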
