Influenza forecasting for the French regions by using EHR, web and climatic data sources with an ensemble approach ARGONet

Effective and timely disease surveillance systems have the potential to help public health officials design interventions to mitigate the effects of disease outbreaks. Currently, healthcare-based disease monitoring systems in France offer influenza activity information that lags real time by 1 to 3 weeks. This reporting delay introduces uncertainty that prevents public health officials from having a timely perspective on population-level disease activity. Here, we present a machine-learning modeling approach that produces real-time estimates and short-term forecasts of influenza activity for the 12 continental regions of France by leveraging multiple disparate data sources, including Google search activity, real-time and local weather information, flu-related Twitter microblogs, electronic health records data, and historical disease activity synchronicities across regions. Our results show that all data sources contribute to improving influenza surveillance and that machine-learning ensembles that combine all data sources lead to accurate and timely predictions.
