Quantifying the effect of media limitations on outbreak data in a global online web-crawling epidemic intelligence system, 2008–2011

Background: This is the first study to quantitatively evaluate the effect of media-related limitations on data from an automated epidemic intelligence system.

Methods: We modeled time series of HealthMap's two main data feeds, Google News and Moreover, to test for evidence of two potential limitations: first, human resource constraints, and second, high-profile outbreaks “crowding out” coverage of other infectious diseases.

Results: Google News events declined by 58.3%, 65.9%, and 14.7% on Saturday, Sunday, and Monday, respectively, relative to other weekdays. For Google News, events were 27.4% lower during Christmas/New Year's weeks and 33.6% lower during American Thanksgiving week than during an average week. Moreover data yielded similar results, with the addition that Memorial Day (US) was associated with a 36.2% reduction in events. Other holiday effects were not statistically significant. We found evidence of a crowd-out phenomenon for influenza/H1N1, where a 50% increase in influenza events corresponded with a 4% decline in other disease events, for Google News only. Other prominent diseases in this database, namely avian influenza (H5N1), cholera, and foodborne illness, were not associated with a crowd-out phenomenon.

Conclusions: These results provide quantitative evidence for the limited impact of editorial biases on HealthMap's web-crawling epidemic intelligence.
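The percentages reported above can be related to regression coefficients under one common set of assumptions. The abstract does not state the model form, so the following is only a sketch. It assumes a log-linear specification in which day-of-week and holiday indicators enter as dummy variables and influenza events enter in logs. Under that assumption, a dummy coefficient b corresponds to a percentage change of 100·(exp(b) − 1), and a log-log coefficient acts as an elasticity. All function and variable names are hypothetical and are not taken from the study.

```python
import math


def dummy_coef_to_pct(coef):
    """Percent change in the outcome implied by a dummy coefficient
    in a regression whose dependent variable is in logs (assumed form)."""
    return 100.0 * (math.exp(coef) - 1.0)


def pct_to_dummy_coef(pct):
    """Inverse of dummy_coef_to_pct: the log-scale coefficient that
    would produce the given percent change."""
    return math.log(1.0 + pct / 100.0)


def implied_elasticity(pct_change_x, pct_change_y):
    """Elasticity implied by paired percent changes, assuming a
    log-log relationship y = c * x**elasticity."""
    return math.log(1.0 + pct_change_y / 100.0) / math.log(1.0 + pct_change_x / 100.0)


# Reported Saturday effect for Google News: a 58.3% decline in events.
saturday_coef = pct_to_dummy_coef(-58.3)

# Reported crowd-out effect: a 50% rise in influenza events alongside
# a 4% decline in other disease events.
elasticity = implied_elasticity(50.0, -4.0)

print(f"Implied Saturday dummy coefficient: {saturday_coef:.3f}")
print(f"Implied crowd-out elasticity: {elasticity:.3f}")
```

The point of the sketch is that large effects on the log scale do not map linearly to percentages, so a coefficient read directly as a percentage would overstate a decline of this size.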
