Vine Pruning for Efficient Multi-Pass Dependency Parsing

Coarse-to-fine inference has been shown to be a robust approximate method for improving the efficiency of structured prediction models while preserving their accuracy. We propose a multi-pass coarse-to-fine architecture for dependency parsing using linear-time vine pruning and structured prediction cascades. Our first-, second-, and third-order models achieve accuracies comparable to those of their unpruned counterparts, while exploring only a fraction of the search space. We observe speed-ups of up to two orders of magnitude compared to exhaustive search. Our pruned third-order model is twice as fast as an unpruned first-order model and also compares favorably to a state-of-the-art transition-based parser for multiple languages.
