Dengue Baidu Search Index data can improve the prediction of local dengue epidemic: A case study in Guangzhou, China

Background: Dengue fever (DF) is an important public health issue in Guangzhou, Guangdong province, China. The problem was highlighted in 2014 by a large, unprecedented outbreak. To respond more quickly to such outbreaks and control them better in the future, this study develops an early warning model that integrates internet-based query data with traditional surveillance data.

Methodology and principal findings: A Dengue Baidu Search Index (DBSI) was collected from the Baidu website and combined with meteorological and demographic factors to develop a predictive model of dengue fever. Generalized additive models (GAM) with and without the DBSI were established. Model fit was assessed with the generalized cross-validation (GCV) score and the deviance explained, and predictive capability was assessed with the intraclass correlation coefficient (ICC) and the root mean squared error (RMSE). Our results show that the DBSI with a one-week lag has a positive linear relationship with local DF occurrence, and that the model with the DBSI (ICC: 0.94, RMSE: 59.86) has better predictive capability than the model without it (ICC: 0.72, RMSE: 203.29).

Conclusions: Our study suggests that the DBSI, combined with traditional disease surveillance and meteorological data, can improve the dengue early warning system in Guangzhou.
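The two prediction metrics and the one-week lag mentioned above can be illustrated with a minimal Python sketch. It is not the study's code: the function names `lag_series`, `rmse`, and `icc_one_way` are hypothetical, and because the abstract does not say which form of the ICC was used, the sketch assumes the one-way random-effects ICC(1,1), treating the observed and predicted weekly counts as two ratings of the same week.

```python
import math


def lag_series(values, weeks=1):
    """Shift a weekly series so position t holds the value from week t - weeks.

    The first `weeks` positions have no earlier value and are filled with None.
    """
    if weeks < 0:
        raise ValueError("weeks must be non-negative")
    shift = min(weeks, len(values))
    return [None] * shift + list(values[:len(values) - shift])


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted counts."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def icc_one_way(observed, predicted):
    """One-way random-effects ICC(1,1) for two 'ratings' per week.

    Assumption: this ICC form is chosen for illustration only; the
    abstract does not specify which variant the authors computed.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need at least two paired values")
    k = 2  # two ratings per week: observed and predicted
    pair_means = [(o + p) / 2 for o, p in zip(observed, predicted)]
    grand_mean = sum(pair_means) / n
    # Between-week and within-week mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    ms_within = sum(
        (o - m) ** 2 + (p - m) ** 2
        for o, p, m in zip(observed, predicted, pair_means)
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return (ms_between - ms_within) / denominator
```

With toy inputs (not data from the study), `lag_series([5, 6, 7])` returns `[None, 5, 6]`, and a perfect prediction gives an RMSE of 0 and an ICC of 1. A lower RMSE and an ICC closer to 1 both indicate closer agreement between predicted and observed counts, which is the sense in which the model with the DBSI outperforms the model without it.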
