Monitoring Public Health Concerns Using Twitter Sentiment Classifications

An important task of public health officials is to track spreading epidemics, including where and how quickly they appear. Officials also need to understand how concerned the population is about a disease outbreak. Twitter can provide this information in real time. In this paper, we focus on sentiment classification of Twitter messages to measure the Degree of Concern (DOC) of Twitter users. To this end, we develop a novel two-step sentiment classification workflow that automatically identifies personal tweets and, among them, negative tweets. Based on this workflow, we present an Epidemic Sentiment Monitoring System (ESMOS) that provides tools for visualizing Twitter users' concern about different diseases. The visual concern map and chart in ESMOS can help public health officials identify the progression and peaks of concern for a disease in space and time, so that appropriate preventive actions can be taken. The DOC measure is derived from these sentiment classifications. We compare clue-based and several Machine Learning methods for classifying the sentiment of disease-related tweets, first into personal versus neutral tweets and then, among the personal tweets, into negative versus neutral. In our experiments, Multinomial Naïve Bayes achieved the best overall results and took significantly less time to build the classifier than the other methods.
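The two-step workflow can be illustrated with a minimal sketch, assuming whitespace tokenization, Laplace smoothing, and a handful of invented toy tweets. None of these details come from the paper: the class `TinyMultinomialNB`, the functions `build_pipeline` and `classify_tweet`, and the label name `non_personal` are hypothetical, chosen only for illustration. The sketch shows a Multinomial Naïve Bayes classifier applied twice, first to separate personal tweets from the rest and then to separate negative from neutral personal tweets. The DOC formula is not stated in the abstract, so it is not implemented here.

```python
import math
from collections import Counter


class TinyMultinomialNB:
    """Toy Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (an assumption)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        n_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[c] / n_docs)  # log prior
            for t in tokens:
                score += math.log((self.word_counts[c][t] + self.alpha) / denom)
            scores[c] = score
        return max(self.classes, key=lambda c: scores[c])


def build_pipeline():
    """Train both steps on invented toy tweets (not data from the paper)."""
    personal = [
        "i am so scared of the flu",
        "i feel sick and worried about measles",
        "my kid has the flu i am worried",
    ]
    non_personal = [
        "health officials report new flu cases",
        "cdc report confirms measles outbreak",
        "new measles cases confirmed officials say",
    ]
    neutral_personal = [
        "i got my flu shot today",
        "i read about measles today",
        "my class talked about the flu today",
    ]
    # Step 1: personal versus everything else.
    step1 = TinyMultinomialNB().fit(
        personal + non_personal,
        ["personal"] * 3 + ["non_personal"] * 3,
    )
    # Step 2: negative versus neutral, trained on personal tweets only.
    step2 = TinyMultinomialNB().fit(
        personal + neutral_personal,
        ["negative"] * 3 + ["neutral"] * 3,
    )
    return step1, step2


def classify_tweet(text, step1, step2):
    """Run step 2 only on tweets that step 1 labels as personal."""
    if step1.predict(text) != "personal":
        return "non_personal"
    return step2.predict(text)


if __name__ == "__main__":
    s1, s2 = build_pipeline()
    for tweet in ["i am worried about the flu", "officials report new measles cases"]:
        print(tweet, "->", classify_tweet(tweet, s1, s2))
```

Chaining the classifiers this way means the second model only ever sees tweets already judged personal, which matches the order of the two steps described above. A real system would need labeled training data and a proper tokenizer.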
