Knowledge-Based Tweet Classification for Disease Sentiment Monitoring

Disease monitoring and tracking are of tremendous value, not only for containing the spread of contagious diseases but also for avoiding unnecessary public concern and even panic. In this chapter, we present a near real-time sentiment analysis service for public health-related tweets. Traditionally, it has not been feasible for humans to measure the degree of public health concern effectively, because of limited resources and significant time delays. To address this problem, we have developed a computational intelligence approach, the Epidemic Sentiment Monitoring System (ESMOS), which automatically analyzes disease sentiments and gauges the Measure of Concern (MOC) expressed by Twitter users. More specifically, we present a knowledge-based approach that employs a disease ontology to detect disease outbreaks and uses a sentiment classifier to analyze the linguistic expressions that convey subjectivity and sentiment polarity: emotions, feelings, opinions, personal attitudes, etc. The two-step sentiment classification method utilizes a subjectivity vocabulary corpus (MPQA) and a sentiment strength corpus (AFINN), as well as the emoticons and profanity words that are often used in social media postings. In the first step, it automatically classifies tweets into personal and non-personal classes, eliminating many tweets, such as non-personal “retweets” of news articles, from further consideration. In the second step, the personal tweets are classified into Negative and non-Negative sentiments. In addition, we present a model that quantifies the public’s MOC about a disease, based on the sentiment classification results. The trends of the public MOC are visualized on a timeline. Correlation analyses between the MOC timeline and the disease-related sentiment category timelines show that the peaks of the MOC are weakly correlated with the peaks of the News timeline, without any appreciable time delay or lead.
Our sentiment analysis method and the MOC trend analyses can be generalized to other topical domains, such as mental health monitoring and crisis management. We present the ESMOS prototype for public health-related disease monitoring, public concern trending, and mapping analyses.
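
The two-step classification described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the tiny word lists stand in for the MPQA and AFINN resources, the rule-based tests stand in for the chapter's sentiment classifier, and `concern_ratio` is a simple placeholder for a daily concern score, not the chapter's MOC model, whose formula the abstract does not give.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicons. The chapter draws on MPQA (subjectivity),
# AFINN (sentiment strength), emoticons, and profanity lists instead.
SUBJECTIVE_CLUES = {"i", "my", "me", "feel", "scared", "worried", "hope", "hate"}
NEGATIVE_STRENGTH = {"scared": -2, "worried": -3, "hate": -3, "sick": -2, "awful": -3}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'("}

# Match simple emoticons first, then lowercase words.
TOKEN_RE = re.compile(r"[:;]'?-?[()]|[a-z']+")


def tokens(text):
    """Lowercase the tweet and split it into words and simple emoticons."""
    return TOKEN_RE.findall(text.lower())


def is_personal(text):
    """Step 1 (stand-in rule): a tweet is personal if it has a subjective clue."""
    return any(t in SUBJECTIVE_CLUES for t in tokens(text))


def is_negative(text):
    """Step 2 (stand-in rule): a tweet is Negative if its summed strength is < 0."""
    score = 0
    for t in tokens(text):
        score += NEGATIVE_STRENGTH.get(t, 0)
        if t in NEGATIVE_EMOTICONS:
            score -= 2  # assumed weight for a negative emoticon
    return score < 0


def classify(text):
    """Apply the two steps in order; non-personal tweets skip step 2."""
    if not is_personal(text):
        return "Non-Personal"
    return "Negative" if is_negative(text) else "Non-Negative"


def daily_counts(dated_tweets):
    """Count class labels per day from (day, text) pairs."""
    counts = defaultdict(Counter)
    for day, text in dated_tweets:
        counts[day][classify(text)] += 1
    return counts


def concern_ratio(day_counter):
    """Placeholder score: share of personal tweets that are Negative."""
    personal = day_counter["Negative"] + day_counter["Non-Negative"]
    return day_counter["Negative"] / personal if personal else 0.0


if __name__ == "__main__":
    sample = [
        ("2014-10-01", "New flu cases reported in Ohio, officials say"),
        ("2014-10-01", "I feel so sick, worried it is the flu :("),
        ("2014-10-01", "Got my flu shot today"),
        ("2014-10-02", "I hope everyone stays healthy"),
    ]
    for day, counter in sorted(daily_counts(sample).items()):
        print(day, dict(counter), round(concern_ratio(counter), 2))
```

Plotting one such score per day gives a timeline of the kind the abstract describes. A real deployment would replace both rule-based tests with classifiers trained on labeled tweets.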
