Sentiment strength detection in short informal text

A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support, or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1–5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches. © 2010 Wiley Periodicals, Inc.
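The lookup-table idea described above can be illustrated with a minimal Python sketch. This is not the SentiStrength implementation. The toy lexicon, the function names, and the rule that letter elongation ("loooove") adds one point of strength are all assumptions made for illustration. The sketch reports separate positive and negative strengths, each on the 1–5 scale mentioned in the abstract.

```python
import re

# Hypothetical toy lexicon (an assumption, not the published term list):
# positive terms carry scores 2..5, negative terms -2..-5.
LEXICON = {"love": 3, "good": 2, "great": 3, "hate": -4, "sad": -3, "bad": -2}

# Matches any character repeated three or more times in a row.
_REPEATS = re.compile(r"(.)\1{2,}")


def lookup(token):
    """Return (score, emphasized) for a token, undoing letter elongation."""
    if token in LEXICON:
        return LEXICON[token], False
    # Try collapsing each elongated run to two letters, then to one,
    # so that "goooood" -> "good" and "loooove" -> "love".
    for replacement in (r"\1\1", r"\1"):
        candidate = _REPEATS.sub(replacement, token)
        if candidate != token and candidate in LEXICON:
            return LEXICON[candidate], True
    return 0, False


def sentiment_strength(text):
    """Return (positive, negative) strengths, each from 1 (none) to 5."""
    positive, negative = 1, 1
    for token in re.findall(r"[a-z]+", text.lower()):
        score, emphasized = lookup(token)
        boost = 1 if emphasized else 0  # assumed emphasis rule
        if score > 0:
            positive = max(positive, min(5, score + boost))
        elif score < 0:
            negative = max(negative, min(5, -score + boost))
    return positive, negative


print(sentiment_strength("I loooove you but I hate mondays"))
```

Keeping the two scales separate lets a single message register as both positive and negative, rather than cancelling to a neutral score.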
