Experiments with a Higher-Order Projective Dependency Parser

We present experiments with a dependency parsing model defined on rich factors. Our model represents dependency trees with factors that include three types of relations between the tokens of a dependency and their children. We extend the projective parsing algorithm of Eisner (1996) for our case, and train models using the averaged perceptron. Our experiments show that considering higher-order information yields significant improvements in parsing accuracy, but comes at a high cost in terms of both time and memory consumption. In the multilingual exercise of the CoNLL-2007 shared task (Nivre et al., 2007), our system obtains the best accuracy for English, and the second best accuracies for Basque and Czech.

[1]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[2]  Yoav Freund,et al.  Large Margin Classification Using the Perceptron Algorithm , 1998, COLT' 98.

[3]  Jan Hajic,et al.  Prague Arabic Dependency Treebank: Development in Data and Tools , 2004 .

[4]  Koby Crammer,et al.  Online Large-Margin Training of Dependency Parsers , 2005, ACL.

[5]  Richard Johansson,et al.  Extended Constituent-to-Dependency Conversion for English , 2007, NODALIDA.

[6]  Adwait Ratnaparkhi,et al.  A Maximum Entropy Model for Prepositional Phrase Attachment , 1994, HLT.

[7]  Díaz de Ilarraza Construction of a Basque Dependency Treebank , 2003 .

[8]  Dilek Z. Hakkani-Tür,et al.  Building a Turkish Treebank , 2003 .

[9]  János Csirik,et al.  The Szeged Treebank , 2005, TSD.

[10]  Sebastian Riedel,et al.  The CoNLL 2007 Shared Task on Dependency Parsing , 2007, EMNLP.

[11]  Chu-Ren Huang,et al.  Sinica Treebank: Design Criteria, Representational Issues and Implementation , 2004 .

[12]  Roberto Basili,et al.  Building the Italian Syntactic-Semantic Treebank , 2003 .

[13]  Michael Collins,et al.  Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms , 2002, EMNLP.

[14]  Jason Eisner,et al.  Bilexical Grammars and their Cubic-Time Parsing Algorithms , 2000 .

[15]  Fernando Pereira,et al.  Online Learning of Approximate Dependency Parsing Algorithms , 2006, EACL.

[16]  Mihai Surdeanu,et al.  Projective Dependency Parsing with Perceptron , 2006, CoNLL.

[17]  Jason Eisner,et al.  Three New Probabilistic Models for Dependency Parsing: An Exploration , 1996, COLING.

[18]  Stelios Piperidis,et al.  Theoretical and Practical Issues in the Construction of a Greek Dependency Treebank , 2005 .