Giving Shape to an N-version Dependency Parser - Improving Dependency Parsing Accuracy for Spanish using Maltparser

Maltparser is a contemporary machine learning-based dependency parsing system that achieves high accuracy. However, a Labelled Attachment Score (LAS) of about 90% seems to be a de facto limit for parsers of this kind. In this paper we present an n-version dependency parser that will work as follows. We found that a small set of words is parsed incorrectly more often than the rest, so the n-version dependency parser consists of n different parsers, each trained specifically to parse those difficult words. An algorithm will send each word to the specific parsers and, by combining their output with the action of a general parser, we expect to achieve better overall accuracy. This work has been developed specifically for Spanish using Maltparser.
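The abstract does not specify the combination algorithm, so the following is only a minimal sketch of one way the idea could be read: a general parser labels every word, and the prediction for a "difficult" word is overridden by a parser specialised for that word. It also shows how LAS is computed, as the fraction of words whose head and label are both correct. All names here (`combine`, `las`, the toy parsers, the example sentence, and the label strings) are hypothetical and do not come from the paper or from Maltparser.

```python
from typing import Callable, Dict, List, Tuple

Arc = Tuple[int, str]                      # (head position, dependency label); 0 = root
Parser = Callable[[List[str]], List[Arc]]  # one arc per input word


def combine(words: List[str], general: Parser,
            specialists: Dict[str, Parser]) -> List[Arc]:
    """Start from the general parse; override arcs of words that have a specialist."""
    result = list(general(words))
    for i, word in enumerate(words):
        specialist = specialists.get(word.lower())
        if specialist is not None:
            # Assumption: the specialist parses the whole sentence, but only
            # its arc for the word it was trained on is trusted.
            result[i] = specialist(words)[i]
    return result


def las(predicted: List[Arc], gold: List[Arc]) -> float:
    """Labelled Attachment Score: share of words with correct head AND label."""
    if not gold:
        return 0.0
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Toy data, invented for illustration only.
words = ["Juan", "come", "y", "bebe"]
gold = [(2, "suj"), (0, "ROOT"), (2, "coord"), (2, "conj")]


def general_parser(ws: List[str]) -> List[Arc]:
    # Pretend the general model attaches the conjunction "y" to the wrong head.
    return [(2, "suj"), (0, "ROOT"), (4, "coord"), (2, "conj")]


def y_parser(ws: List[str]) -> List[Arc]:
    # Pretend this model is reliable only for "y"; its other arcs are ignored.
    return [(0, "ROOT"), (0, "ROOT"), (2, "coord"), (0, "ROOT")]


combined = combine(words, general_parser, {"y": y_parser})
print(las(general_parser(words), gold))  # 0.75
print(las(combined, gold))               # 1.0
```

Routing by surface word form is the simplest choice for a sketch; a real system could equally route on lemma or part of speech, and would need to check that an overridden arc still leaves a well-formed tree.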
