How Important is Syntactic Parsing Accuracy? An Empirical Evaluation on Sentiment Analysis

Syntactic parsing, the process of obtaining the internal structure of sentences in natural languages, is a crucial task for artificial intelligence applications that need to extract meaning from natural language text or speech. Sentiment analysis is one example of an application for which parsing has recently proven useful. In recent years, there have been significant advances in the accuracy of parsing algorithms. In this article, we perform an empirical, task-oriented evaluation to determine how parsing accuracy influences the performance of a state-of-the-art sentiment analysis system that determines the polarity of sentences from their parse trees. In particular, we evaluate the system using four well-known dependency parsers, including both current models with state-of-the-art accuracy and less accurate models that require fewer computational resources. The experiments show that all of the parsers produce similarly good results in the sentiment analysis task, with differences in parsing accuracy having no relevant influence on the results. Since parsing is currently a task with a relatively high computational cost that varies strongly between algorithms, this suggests that sentiment analysis researchers and users should prioritize speed over accuracy when choosing a parser, and that parsing researchers should investigate models that improve speed further, even at some cost to accuracy.
