Reconstructing the MERS disease outbreak from news

Disease surveillance is critical for mobilizing health care resources and deciding on isolation measures to contain the spread of infectious diseases. Because ground-truth signals of rare and deadly diseases are sparse, it can be useful to enrich surveillance systems with measures of social and environmental factors that are known to influence the spread of a disease. One approach to measuring such factors is to use real-time news streams. In this study, we model the epidemiological transmission of Middle East Respiratory Syndrome (MERS) during the outbreak that occurred from 2013 to 2018 in the Arabian Peninsula. Using the GDELT news event database, we show that conflict-related signals allow us to reconstruct the time series of newly infected cases per week. Adding these signals reduces the residual sum of squared errors by a factor of 3.36 compared to a standard epidemiological model. We also capture interpretable, time-sensitive factors that illustrate the importance of using real-time news streams to model the evolution of a disease such as MERS and to facilitate early and effective policy interventions.
