Epidemiological Data Challenges: Planning for a More Robust Future Through Data Standards

Accessible epidemiological data are of great value for emergency preparedness and response, for understanding disease progression through a population, and for building statistical and mechanistic disease models that enable forecasting. The status quo, however, makes acquiring and using such data difficult in practice. In many cases, a primary way of obtaining epidemiological data is through the internet, but the methods by which the data are presented to the public often differ drastically among institutions. As a result, there is a strong need for better data sharing practices. This paper identifies, in detail and with examples, three key challenges one encounters when attempting to acquire and use epidemiological data: (1) interfaces, (2) data formatting, and (3) reporting. These challenges form the basis for suggestions and guidance for improvement as these systems evolve. If these data and interface recommendations were adhered to, epidemiological and public health analysis, modeling, and informatics work would be significantly streamlined, which could in turn yield better public health decision-making capabilities.
