Rhetorical Structure Theory for polarity estimation: An experimental study

Sentiment analysis tools often rely on counts of sentiment-carrying words, ignoring structural aspects of content. Natural Language Processing has been fruitfully exploited in text mining, but advanced discourse processing is still not widely used for mining opinions. Some studies, however, have extracted opinions based on the discursive role of text segments. The merits of such computationally intensive analyses have thus far been assessed only in very specific, small-scale scenarios. In this paper, we investigate the usefulness of Rhetorical Structure Theory in various sentiment analysis tasks on different types of information sources. First, we demonstrate how to perform a large-scale ranking of individual blog posts in terms of their overall polarity, by exploiting the rhetorical structure of a few key evaluative sentences. To further validate our findings, we additionally explore the potential of Rhetorical Structure Theory in sentence-level polarity classification of news and product reviews. Our most valuable polarity classification features turn out to capture the way in which polar terms are used, rather than the sentiment-carrying words per se.
