Stacking Dependency Parsers

We explore a stacked framework for learning to predict dependency structures for natural language sentences. A typical approach in graph-based dependency parsing has been to assume a factorized model, where local features are used but a global function is optimized (McDonald et al., 2005b). Recently, Nivre and McDonald (2008) used the output of one dependency parser to provide features for another. We show that this is an example of stacked learning, in which a second predictor is trained to improve the performance of the first. Further, we argue that this technique is a novel way of approximating rich non-local features in the second parser, without sacrificing efficient, model-optimal prediction. Experiments on twelve languages show that stacking transition-based and graph-based parsers improves performance over existing state-of-the-art dependency parsers.
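The stacking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a common stacked-learning recipe in which the first (level-0) parser's predictions on the training data are obtained by k-fold jackknifing, so that the second (level-1) parser sees realistic level-0 errors. All names here (`jackknife_predictions`, `stacked_arc_features`, the `Trainer` and `Parser` types, the feature strings) are hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A sentence paired with gold heads: heads[i] is the head of token i+1,
# with 0 standing for the artificial root. (Assumed encoding.)
Example = Tuple[Sequence[str], List[int]]
Parser = Callable[[Sequence[str]], List[int]]
Trainer = Callable[[List[Example]], Parser]


def jackknife_predictions(
    data: List[Example], train_level0: Trainer, k: int
) -> List[Optional[List[int]]]:
    """Parse every training sentence with a level-0 parser that was
    trained on the other folds, so it never saw that sentence."""
    preds: List[Optional[List[int]]] = [None] * len(data)
    for fold in range(k):
        held_out = [i for i in range(len(data)) if i % k == fold]
        rest = [data[i] for i in range(len(data)) if i % k != fold]
        parser = train_level0(rest)
        for i in held_out:
            preds[i] = parser(data[i][0])
    return preds


def stacked_arc_features(
    tokens: Sequence[str], head: int, mod: int, level0_heads: List[int]
) -> List[str]:
    """Features for the candidate arc head -> mod (1-based, 0 = root).

    The first feature is an ordinary local (arc-factored) feature. The
    second looks at the level-0 parser's output; it stays local to the
    arc, yet it summarizes a decision that the level-0 parser may have
    made using non-local information.
    """
    head_word = tokens[head - 1] if head > 0 else "ROOT"
    feats = [f"hw={head_word}|mw={tokens[mod - 1]}"]
    feats.append(f"l0_agrees={level0_heads[mod - 1] == head}")
    return feats
```

Under these assumptions, the level-1 parser would be trained on arcs featurized with `stacked_arc_features`, using the jackknifed level-0 predictions at training time and the output of a level-0 parser trained on all the data at test time. Because the added features attach to single arcs, the level-1 model can keep whatever exact arc-factored decoder it already uses.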

[1] M. Sansalone et al. Journal of Research of the National Bureau of Standards, 1959, Nature.

[2] David H. Wolpert et al. Stacked generalization, 1992, Neural Networks.

[3] Eric Brill et al. A corpus-based approach to language learning, 1993.

[4] Adwait Ratnaparkhi et al. A maximum entropy model for parsing, 1994, ICSLP.

[5] Jason Eisner et al. Three New Probabilistic Models for Dependency Parsing: An Exploration, 1996, COLING.

[6] Avrim Blum et al. The Bottleneck, 2021, Monopsony Capitalism.

[7] Fernando Pereira et al. Relating Probabilistic Grammars and Automata, 1999, ACL.

[8] Giorgio Satta et al. Efficient Parsing for Bilexical Context-Free Grammars and Head Automaton Grammars, 1999, ACL.

[9] Andrew McCallum et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[10] Yuji Matsumoto et al. Statistical Dependency Analysis with Support Vector Machines, 2003, IWPT.

[11] Aron Culotta et al. Dependency Tree Kernels for Relation Extraction, 2004, ACL.

[12] Leo Breiman et al. Stacked regressions, 2004, Machine Learning.

[13] Joakim Nivre et al. Memory-Based Dependency Parsing, 2004, CoNLL.

[14] Yuan Ding et al. Machine Translation Using Probabilistic Synchronous Dependency Insertion Grammars, 2005, ACL.

[15] William W. Cohen et al. Stacked Sequential Learning, 2005, IJCAI.

[16] Fernando Pereira et al. Non-Projective Dependency Parsing using Spanning Tree Algorithms, 2005, HLT.

[17] Alon Lavie et al. A Classifier-Based Parser with Linear Run-Time Complexity, 2005, IWPT.

[18] Koby Crammer et al. Online Large-Margin Training of Dependency Parsers, 2005, ACL.

[19] Sabine Buchholz et al. CoNLL-X Shared Task on Multilingual Dependency Parsing, 2006, CoNLL.

[20] Sebastian Riedel et al. Incremental Integer Linear Programming for Non-projective Dependency Parsing, 2006, EMNLP.

[21] Joakim Nivre et al. Discriminative Classifiers for Deterministic Dependency Parsing, 2006, ACL.

[22] Fernando Pereira et al. Multilingual Dependency Analysis with a Two-Stage Discriminative Parser, 2006, CoNLL.

[23] Fernando Pereira et al. Online Learning of Approximate Dependency Parsing Algorithms, 2006, EACL.

[24] Joakim Nivre et al. Labeled Pseudo-Projective Dependency Parsing with Support Vector Machines, 2006, CoNLL.

[25] Noah A. Smith et al. What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA, 2007, EMNLP.

[26] William W. Cohen et al. Stacked Graphical Models for Efficient Inference in Markov Random Fields, 2007, SDM.

[27] Giorgio Satta et al. On the Complexity of Non-Projective Data-Driven Dependency Parsing, 2007, IWPT.

[28] Joakim Nivre et al. Characterizing the Errors of Data-Driven Dependency Parsing Models, 2007, EMNLP.

[29] Dan Klein et al. Structure compilation: trading structure for features, 2008, ICML.

[30] Joakim Nivre et al. Integrating Graph-Based and Transition-Based Dependency Parsers, 2008, ACL.

[31] David A. Smith et al. Dependency Parsing by Belief Propagation, 2008, EMNLP.

[32] Ivan Titov et al. A Latent Variable Model for Generative Dependency Parsing, 2007, Trends in Parsing Technology.