Analysis of Web Access Logs for Surveillance of Influenza

The purpose of this study was to determine whether the level of influenza in a population correlates with the number of times that internet users access information about influenza on health-related Web sites. We obtained Web access logs from the Healthlink Web site. Web access logs record information about each user and the content that user accessed, and are maintained electronically by most Web sites, including Healthlink. We computed weekly counts of accesses to selected influenza-related articles on the Healthlink Web site and measured their correlation with traditional influenza surveillance data from the Centers for Disease Control and Prevention (CDC) using the cross-correlation function (CCF). We defined timeliness as the time lag at which the correlation reached its maximum. There was a moderately strong correlation between the frequency of influenza-related article accesses and the CDC's traditional surveillance data, but the results on timeliness were inconclusive. With improvements in methods for performing spatial analysis of the data and the continuing increase in Web searching behavior among Americans, Web article access has the potential to become a useful data source for public health early warning systems.
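The procedure described above (weekly access counts, cross-correlation against a reference series, timeliness as the lag of maximum correlation) can be illustrated with a minimal Python sketch. It is not the study's actual implementation. The function names, the use of ISO calendar weeks for binning, and the choice to compute a Pearson correlation over the overlapping portion of the two series at each lag are all assumptions made for illustration; a textbook CCF normalizes by the full-series mean and variance instead.

```python
from collections import Counter
from math import sqrt


def weekly_counts(access_dates):
    """Count accesses per (ISO year, ISO week). ISO weeks are an assumption."""
    counts = Counter()
    for d in access_dates:
        iso = d.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(x)
    if n < 2:
        return float("nan")
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def cross_correlation(web, reference, max_lag):
    """Correlate web[t] with reference[t + lag] for each lag in [-max_lag, max_lag].

    A positive lag means the Web series leads the reference series by that
    many weeks. Both series must have the same length.
    """
    n = len(web)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = web[: n - lag], reference[lag:]
        else:
            x, y = web[-lag:], reference[: n + lag]
        result[lag] = pearson(x, y)
    return result


def timeliness(ccf):
    """Return the lag with the largest defined correlation."""
    valid = {lag: r for lag, r in ccf.items() if r == r}  # drop NaN entries
    return max(valid, key=valid.get)
```

With two hypothetical weekly series in which the reference series repeats the Web series one week later, `timeliness(cross_correlation(web, reference, 3))` returns a lag of 1, meaning the Web series leads by one week under this sign convention.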
