Reassessing Google Flu Trends Data for Detection of Seasonal and Pandemic Influenza: A Comparative Epidemiological Study at Three Geographic Scales

The goal of influenza-like illness (ILI) surveillance is to determine the timing, location and magnitude of outbreaks by monitoring the frequency and progression of clinical case incidence. Advances in computational and information technology have allowed for automated collection of higher volumes of electronic data and more timely analyses than previously possible. Novel surveillance systems, including those based on internet search query data such as Google Flu Trends (GFT), are being used as surrogates for clinically based reporting of ILI. We investigated the reliability of GFT over the decade from 2003 to 2013, comparing weekly public health surveillance data with search query data to characterize the timing and intensity of seasonal and pandemic influenza at the national (United States), regional (Mid-Atlantic) and local (New York City) levels. We identified substantial flaws in the original and updated GFT models at all three geographic scales, including completely missing the first wave of the 2009 influenza A/H1N1 pandemic and greatly overestimating the intensity of the A/H3N2 epidemic during the 2012/2013 season. These results were obtained for both the original (2008) and the updated (2009) GFT algorithms. The performance of both models was problematic, perhaps because of changes in internet search behavior and differences in the seasonality, geographical heterogeneity and age distribution of the epidemics between the periods of GFT model-fitting and prospective use. We conclude that GFT data may not provide reliable surveillance for seasonal or pandemic influenza and should be interpreted with caution until the algorithm can be improved and evaluated. Current internet search query data are no substitute for timely local clinical and laboratory surveillance, or for national surveillance based on local data collection. New-generation surveillance systems such as GFT should incorporate near-real-time electronic health data and computational methods for continued model-fitting and ongoing evaluation and improvement.
