Top Accuracy and Fast Dependency Parsing is not a Contradiction

In addition to high accuracy, short parsing and training times are the most important properties of a parser. However, parsing and training times are still relatively long. To determine why, we analyzed the time usage of a dependency parser. We show that the mapping of features onto their weights in the support vector machine is the major factor in the time complexity. To resolve this problem, we implemented the passive-aggressive perceptron algorithm as a Hash Kernel. The Hash Kernel substantially improves parsing times and takes into account the features of negative examples built during training, which leads to higher accuracy. We further increased parsing and training speed with parallel feature extraction and a parallel parsing algorithm. We are convinced that the Hash Kernel and the parallelization can be applied successfully to other NLP applications as well, such as transition-based dependency parsers, phrase structure parsers, and machine translation.
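The core idea named in the abstract is to replace the costly lookup that maps each feature onto its weight index with a hash function that addresses a fixed-size weight vector directly, combined with a passive-aggressive update. The following is a minimal sketch of that idea, not the authors' implementation: the names (WEIGHT_DIM, feature_index, score, pa_update), the vector size, and the simplified PA step are all illustrative assumptions.

```python
# Sketch of a hash kernel with a passive-aggressive style update (illustrative only).
import numpy as np

WEIGHT_DIM = 2 ** 20            # size of the hashed weight vector (assumed, not from the paper)
weights = np.zeros(WEIGHT_DIM)  # shared weight vector addressed by hashed features


def feature_index(feature: str) -> int:
    """Map a feature string directly to a slot in the weight vector via hashing,
    avoiding an explicit feature-to-index dictionary."""
    return hash(feature) % WEIGHT_DIM


def score(features: list[str]) -> float:
    """Score a candidate structure as the sum of its hashed feature weights."""
    return sum(weights[feature_index(f)] for f in features)


def pa_update(gold_features: list[str], predicted_features: list[str], loss: float) -> None:
    """One simplified passive-aggressive update: move weights toward the gold
    features and away from the predicted (negative) features."""
    diff: dict[int, float] = {}
    for f in gold_features:
        diff[feature_index(f)] = diff.get(feature_index(f), 0.0) + 1.0
    for f in predicted_features:
        diff[feature_index(f)] = diff.get(feature_index(f), 0.0) - 1.0
    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0.0:
        return
    tau = loss / norm_sq  # basic PA step size; no aggressiveness cap (simplified)
    for idx, v in diff.items():
        weights[idx] += tau * v
```

Because hashed features from incorrect (negative) candidate structures also land in the same weight vector, their weights are updated as well, which is the mechanism the abstract credits for the accuracy gain.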
