Performance-oriented dependency parsing

Many dependency parsers have been developed in the last decade. This book describes the motivation for developing yet another parser, MDParser. It presents the state of the art and discusses the deficits of current developments. The main problem with current parsers is that dependency parsing is treated independently of what happens before and after it. In practice, however, parsing is rarely done for its own sake, but rather so that the results can be used in a follow-up application. Additionally, current parsers are accuracy-oriented: they focus only on the quality of the results and neglect other important properties, especially efficiency. Evaluating an NLP technology is sometimes as difficult as the task itself. For dependency parsing this was long thought not to be the case; however, recent work shows that the current evaluation options are limited. This book proposes a methodology that accounts for the weaknesses and combines the strengths of the current approaches. Finally, MDParser is evaluated against other state-of-the-art parsers. The results show that it is the fastest parser currently available and that it can process plain text, which other parsers usually cannot. Its accuracy is slightly below the best results in the field; however, it is demonstrated that this difference is not decisive for applications.
