Modeling Review Argumentation for Robust Sentiment Analysis

Most text classification approaches model text at the lexical and syntactic level only, lacking domain robustness and explainability. In tasks like sentiment analysis, such approaches can result in limited effectiveness if the texts to be classified consist of a series of arguments. In this paper, we claim that even a shallow model of the argumentation of a text allows for an effective and more robust classification, while providing intuitive explanations of the classification results. Here, we apply this idea to the supervised prediction of sentiment scores for reviews. We combine existing approaches from sentiment analysis with novel features that compare the overall argumentation structure of the given review text to a learned set of common sentiment flow patterns. Our evaluation in two domains demonstrates the benefit of modeling argumentation for text classification in terms of effectiveness and robustness.
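The idea of comparing a review's sentiment flow to learned flow patterns can be illustrated with a minimal sketch. The abstract does not specify how flows are represented or compared, so everything below is an assumption made for illustration: a flow is taken to be a sequence of local sentiment values (1.0 positive, 0.0 objective, -1.0 negative), flows are resampled to a fixed length, and similarity is one minus the scaled mean absolute difference. The function names `normalize_flow`, `flow_similarity`, and `flow_features` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def normalize_flow(flow: Sequence[float], length: int = 10) -> List[float]:
    """Resample a sentiment flow to a fixed length (assumed normalization).

    Each target position copies the value of the source position that
    covers the same relative part of the text.
    """
    if not flow:
        raise ValueError("flow must contain at least one value")
    n = len(flow)
    return [flow[min(n - 1, (i * n) // length)] for i in range(length)]


def flow_similarity(flow: Sequence[float], pattern: Sequence[float],
                    length: int = 10) -> float:
    """Similarity in [0, 1] between a flow and a pattern.

    Assumes values lie in [-1, 1], so the largest possible absolute
    difference per position is 2.
    """
    a = normalize_flow(flow, length)
    b = normalize_flow(pattern, length)
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / length
    return 1.0 - mean_abs_diff / 2.0


def flow_features(flow: Sequence[float],
                  patterns: Sequence[Sequence[float]],
                  length: int = 10) -> List[float]:
    """One similarity feature per pattern, usable alongside other features."""
    return [flow_similarity(flow, p, length) for p in patterns]


# Hypothetical example: a review that starts positive and ends negative,
# compared against two made-up patterns.
review_flow = [1.0, 1.0, -1.0, -1.0]
patterns = [[1.0, -1.0], [-1.0, 1.0]]
features = flow_features(review_flow, patterns, length=4)
```

In this sketch, each feature measures how closely the review follows one pattern; such values could be appended to standard sentiment features before training a supervised model. How the patterns themselves are learned is left open here.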
