Dengue prediction by the web: Tweets are a useful tool for estimating and forecasting Dengue at country and city level

Background Infectious diseases are a leading threat to public health. Accurate and timely monitoring of disease risk and progress can reduce their impact. Mentions of a disease on social networks correlate with physician visits by patients and can be used to estimate disease activity. Dengue is the fastest-growing mosquito-borne viral disease, with an estimated annual incidence of 390 million infections, of which 96 million manifest clinically. The Dengue burden is likely to increase in the future owing to trends toward increased urbanization, scarce water supplies and, possibly, environmental change. The epidemiological dynamics of Dengue are complex and difficult to predict, partly because surveillance systems are costly and slow.
Methodology / Principal findings In this study, we aimed to quantitatively assess the usefulness of data acquired from Twitter for the early detection and monitoring of Dengue epidemics, at both country and city level on a weekly basis. We evaluated and demonstrated the potential of tweet-based modeling for Dengue estimation and forecasting, in comparison with other available web-based data, namely Google Trends and Wikipedia access logs. We also studied the factors that might influence the goodness-of-fit of the model. We built a simple model based on tweets that was able to ‘nowcast’, i.e. estimate disease numbers in the same week, and also to ‘forecast’ disease in future weeks. At the country level, tweets are strongly associated with Dengue cases and can be used to estimate present and future Dengue cases up to 8 weeks in advance. At city level, tweets are also useful for estimating Dengue activity. Our model can be applied successfully to small and less developed cities, suggesting a robust construction, even though it may be influenced by the incidence of the disease, the local activity of Twitter, and social factors, including human development index and internet access.
Conclusions The association of tweets with Dengue cases is valuable for assisting traditional Dengue surveillance in real time and at low cost. Tweets can successfully nowcast, i.e. estimate Dengue in the present week, and also forecast, i.e. predict Dengue up to 8 weeks in the future, at both country and city level with high estimation capacity.
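
The distinction between nowcasting and forecasting can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of weekly case counts on weekly tweet counts, shifted by a lead of 0 weeks (nowcast) or more (forecast). The abstract does not specify the model form, so the linear form, the function names `fit_lagged_model` and `predict`, and the numbers below are illustrative assumptions, not the authors' method or data.

```python
import numpy as np


def fit_lagged_model(tweets, cases, lead=0):
    """Fit cases[t + lead] = a + b * tweets[t] by ordinary least squares.

    lead=0 is a nowcast (same week); lead=k > 0 is a k-week-ahead forecast.
    Returns the array [a, b]. Hypothetical helper, not from the paper.
    """
    tweets = np.asarray(tweets, dtype=float)
    cases = np.asarray(cases, dtype=float)
    if tweets.shape != cases.shape:
        raise ValueError("tweets and cases must have the same length")
    if lead < 0 or lead >= len(tweets) - 1:
        raise ValueError("lead must leave at least two paired weeks")
    # Pair each week's tweet count with the case count `lead` weeks later.
    x = tweets[: len(tweets) - lead]
    y = cases[lead:]
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, tweets_now):
    """Apply the fitted intercept and slope to new tweet counts."""
    return coef[0] + coef[1] * np.asarray(tweets_now, dtype=float)


if __name__ == "__main__":
    # Made-up weekly counts, for illustration only.
    weekly_tweets = [10, 20, 30, 40, 50, 60]
    weekly_cases = [25, 45, 65, 85, 105, 125]
    nowcast_coef = fit_lagged_model(weekly_tweets, weekly_cases, lead=0)
    forecast_coef = fit_lagged_model(weekly_tweets, weekly_cases, lead=2)
    print("nowcast for this week:", predict(nowcast_coef, 60))
    print("forecast for two weeks ahead:", predict(forecast_coef, 60))
```

Fitting one model per lead keeps each horizon independent, so the loss of fit at longer leads can be compared directly, for example lead by lead up to the 8 weeks mentioned above.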
