PREPRINT: Using digital epidemiology methods to monitor influenza-like illness in the Netherlands in real-time: the 2017-2018 season

Introduction: Despite the early development of Google Flu Trends in 2009, digital epidemiology methods have not been adopted widely, and most research has focused on the USA. In this article we demonstrate the prediction of real-time trends in influenza-like illness (ILI) in the Netherlands using search engine query data.

Methods: We used flu-related search query data from Google Trends in combination with traditional surveillance data from 40 sentinel general practices to build our predictive models. To test the predictive performance of the search engine data, we introduced an artificial 4-week delay in the use of GP data in the models. Simulating the weekly use of a prediction model across the 2017/2018 flu season, we used lasso regression to fit 52 prediction models (one for each week) for weekly ILI incidence. We used rolling forecast cross-validation for lambda optimization in each model, minimizing the maximum absolute error.

Results: The models accurately predicted the number of ILI cases during the 2017/18 ILI epidemic in real time, with a mean absolute error of 1.40 (per 10,000 population) and a maximum absolute error of 6.36. The model would also have identified the onset, peak, and end of the epidemic with reasonable accuracy. The number of predictors retained in the prediction models was small, ranging from 3 to 5, with a single keyword (‘Griep’ = ‘Flu’) having by far the most weight in all models.

Discussion: This study demonstrates the feasibility of accurate real-time ILI incidence predictions in the Netherlands using internet search query data. Digital ILI monitoring strategies may be useful in countries with poor surveillance systems, or for monitoring emergent diseases, including influenza pandemics. We hope that this transparent and accessible case study inspires and supports further developments in the field of digital epidemiology in Europe and beyond.
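The model-fitting procedure described in the Methods (lasso regression with lambda chosen by rolling forecast cross-validation, minimizing the maximum absolute error under a 4-week reporting delay) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the coordinate-descent solver, the lambda grid, the minimum training window, and the synthetic data are all assumptions made for illustration, and the real models used Google Trends keywords and sentinel GP data.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, lam, n_iter=100):
    """Coordinate-descent lasso minimising (1/2n)*RSS + lam*sum(|beta|).

    Features and outcome are centred so the intercept is not penalised.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - x_mean[j] for row in X] for j in range(p)]
    resid = [yi - y_mean for yi in y]  # residuals when all betas are zero
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = cols[j]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant feature carries no information
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(v * (e + v * beta[j]) for v, e in zip(col, resid)) / n
            new_beta = soft_threshold(rho, lam) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                resid = [e - v * delta for v, e in zip(col, resid)]
                beta[j] = new_beta
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta


def predict(intercept, beta, row):
    return intercept + sum(b * v for b, v in zip(beta, row))


def rolling_max_abs_error(X, y, lam, min_train=20, delay=4):
    """Rolling-origin evaluation: for each week t, train only on weeks whose
    surveillance data would already be available (older than `delay` weeks),
    then predict week t from its search data. Returns the worst error."""
    errors = []
    for t in range(min_train + delay, len(y)):
        cutoff = t - delay
        intercept, beta = fit_lasso(X[:cutoff], y[:cutoff], lam)
        errors.append(abs(predict(intercept, beta, X[t]) - y[t]))
    return max(errors)


def choose_lambda(X, y, grid, min_train=20, delay=4):
    """Pick the penalty with the smallest maximum absolute error."""
    return min(grid, key=lambda lam: rolling_max_abs_error(X, y, lam, min_train, delay))


# Synthetic stand-in data (assumption): 40 weeks, 3 hypothetical keywords,
# of which only the first actually drives the outcome.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y = [5.0 + 3.0 * row[0] + random.gauss(0, 0.1) for row in X]

lambda_grid = [0.01, 0.1, 1.0, 100.0]  # illustrative grid only
best_lambda = choose_lambda(X, y, lambda_grid)
intercept, beta = fit_lasso(X, y, best_lambda)
print("chosen lambda:", best_lambda)
print("coefficients:", [round(b, 3) for b in beta])
```

In a weekly simulation like the one in the abstract, the selection and fitting steps would be repeated for each week of the season with the data available at that point, giving one model per week. Minimizing the maximum rather than the mean error penalizes a single badly missed week, which matters most around the epidemic peak.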
