Polarity classification using structure-based vector representations of text

The exploitation of structural aspects of content is becoming increasingly popular in rule-based polarity classification systems. Such systems typically weight the sentiment conveyed by text segments in accordance with these segments' roles in the structure of a text, as identified by deep linguistic processing. Conversely, state-of-the-art machine learning polarity classifiers typically aim to exploit patterns in vector representations of texts, mostly covering the occurrence of words or word groups in these texts. However, since structural aspects of content have been shown to contain valuable information as well, we propose to use structure-based features in vector representations of text. We evaluate the usefulness of our novel features on collections of English reviews in various domains. Our experimental results suggest that, even though word-based features are indispensable to good polarity classifiers, structure-based sentiment information provides valuable additional guidance that can help significantly improve the polarity classification performance of machine learning classifiers. The most informative features capture the sentiment conveyed by specific rhetorical elements that constitute a text's core or provide crucial contextual information.

We propose structure-based features for machine learning polarity classification. Adding our features to common word-based features significantly boosts performance. The most informative features capture the sentiment conveyed by rhetorical elements. Useful rhetorical elements form a text's core or provide crucial context information.
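As a rough illustration of the idea of combining word-based and structure-based features in one vector, the following minimal sketch concatenates word counts with one sentiment score per rhetorical role. It is not the authors' implementation: the toy lexicon `LEXICON`, the role inventory `ROLES`, the function `build_features`, and the pre-segmented input are all hypothetical, and a real system would presumably obtain the segments and their roles from a discourse parser.

```python
from collections import Counter

# Hypothetical toy sentiment lexicon; a real system would use a full resource.
LEXICON = {"great": 1.0, "good": 0.5, "bad": -0.5, "terrible": -1.0}

# Hypothetical rhetorical roles; the actual inventory depends on the parser used.
ROLES = ("nucleus", "satellite")


def build_features(segments, vocabulary):
    """Concatenate word counts with one sentiment score per rhetorical role.

    segments: list of (role, text) pairs, assumed to come from a discourse parser;
              every role is assumed to be listed in ROLES.
    vocabulary: ordered list of words that define the word-based features.
    """
    counts = Counter()
    role_sentiment = {role: 0.0 for role in ROLES}
    for role, text in segments:
        tokens = text.lower().split()
        counts.update(tokens)
        # Sum the lexicon scores of the segment's words under the segment's role.
        role_sentiment[role] += sum(LEXICON.get(t, 0.0) for t in tokens)
    word_features = [float(counts[w]) for w in vocabulary]
    structure_features = [role_sentiment[role] for role in ROLES]
    return word_features + structure_features


segments = [
    ("satellite", "the plot sounded bad on paper"),
    ("nucleus", "but the movie was great"),
]
vocabulary = ["movie", "plot", "great", "bad"]
features = build_features(segments, vocabulary)
```

In this toy input the word counts alone are balanced between a positive and a negative word, while the per-role scores record that the positive sentiment sits in the nucleus. A classifier trained on such vectors could, under these assumptions, learn to weight the two roles differently.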
