Vesicular stomatitis forecasting based on Google Trends

Background: Vesicular stomatitis (VS) is an important viral disease of livestock. Its main feature is irregular blisters on the lips, tongue, oral mucosa, hoof crown (coronary band) and teats. Humans can also be infected with the VS virus and can develop meningitis. This study analyses the 2014 VS outbreaks in the United States in order to predict VS outbreak trends accurately.

Methods: Data on VS outbreaks in the United States were collected from the World Organisation for Animal Health (OIE). Search data were obtained by entering 24 disease-related keywords into Google Trends. Pearson and Spearman correlation coefficients were then calculated, and these showed a relationship between the number of outbreaks and the keyword search data derived from Google Trends. Finally, predictive models were constructed using both qualitative classification and quantitative regression.

Results: For the regression model, the two reported Pearson correlation coefficients between predicted and actual outbreaks were 0.953 and 0.948. For qualitative classification, five predictive models were constructed and the best-performing one was selected. Its sensitivity (SN), specificity (SP) and prediction accuracy (ACC) were 78.52%, 72.5% and 77.14%, respectively.

Conclusion: This study used Google search data to construct a qualitative classification model and a quantitative regression model. The results show that the method is effective and that both models forecast VS outbreaks accurately.
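The statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`pearson`, `average_ranks`, `spearman`, `sn_sp_acc`) and all numbers in the toy series are assumptions made up for illustration; they are not data from the study. The sketch computes Pearson and Spearman correlation between an outbreak series and a search-volume series, and then SN, SP and ACC from binary outbreak labels.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def sn_sp_acc(actual, predicted):
    """Sensitivity, specificity and accuracy for 0/1 labels (1 = outbreak).

    Assumes both classes occur in `actual`, so neither denominator is zero.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / len(pairs)
    return sn, sp, acc


# Hypothetical toy data, NOT taken from the study.
weekly_outbreaks = [0, 2, 5, 9, 14, 8, 3, 1]
search_index = [5, 12, 30, 48, 70, 55, 20, 9]
actual_labels = [1, 1, 1, 0, 0, 1, 0, 0]
predicted_labels = [1, 1, 0, 0, 1, 1, 0, 0]

if __name__ == "__main__":
    print("Pearson:", round(pearson(weekly_outbreaks, search_index), 3))
    print("Spearman:", round(spearman(weekly_outbreaks, search_index), 3))
    print("SN, SP, ACC:", sn_sp_acc(actual_labels, predicted_labels))
```

In practice, library routines such as `scipy.stats.pearsonr` and `scipy.stats.spearmanr` would normally replace the hand-written functions; the plain-Python versions are used here only to keep the sketch self-contained.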
