Global Transition-based Non-projective Dependency Parsing

Shi, Huang, and Lee (2017a) obtained state-of-the-art results for English and Chinese dependency parsing by combining dynamic-programming implementations of transition-based dependency parsers with a minimal set of bidirectional LSTM features. However, their results were limited to projective parsing. In this paper, we extend their approach to support non-projectivity by providing the first practical implementation of the MH₄ algorithm, an O(n^4) mildly non-projective dynamic-programming parser with very high coverage on non-projective treebanks. To make MH₄ compatible with minimal transition-based feature sets, we introduce a transition-based interpretation of it in which parser items are mapped to sequences of transitions. We thus obtain the first implementation of global decoding for non-projective transition-based parsing, and demonstrate empirically that it is more effective than its projective counterpart in parsing a number of highly non-projective languages.
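To make the projective/non-projective distinction concrete, the following is a minimal Python sketch that tests whether a dependency tree contains crossing arcs. It is not the MH₄ algorithm and is not taken from the paper; the function names `crossing_arcs` and `is_projective`, the head-list encoding (token i has head `heads[i-1]`, with 0 as an artificial root placed to the left of the sentence), and the example sentence are all assumptions made for illustration.

```python
from typing import List, Tuple

Arc = Tuple[int, int]


def crossing_arcs(heads: List[int]) -> List[Tuple[Arc, Arc]]:
    """Return all pairs of arcs whose endpoints interleave.

    Assumed encoding: heads[i-1] is the head of token i (tokens are
    1-indexed) and 0 denotes an artificial root at the left edge.
    """
    # Store each arc as (left endpoint, right endpoint), ignoring direction.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    crossings = []
    for i, (a, b) in enumerate(arcs):
        for c, d in arcs[i + 1:]:
            # Two arcs cross when exactly one endpoint of one arc lies
            # strictly inside the span of the other.
            if a < c < b < d or c < a < d < b:
                crossings.append(((a, b), (c, d)))
    return crossings


def is_projective(heads: List[int]) -> bool:
    """A tree is treated as projective here when no two arcs cross."""
    return not crossing_arcs(heads)


# Token 2 is the root; tokens 1 and 3 attach to it. No arcs cross.
projective_tree = [2, 0, 2]

# Token 3 is the root; token 1 attaches to 3 and token 2 attaches to 4,
# so arcs (1, 3) and (2, 4) interleave.
non_projective_tree = [3, 4, 0, 3]
```

A projective parser can never output the second tree, whatever its scores; this is the coverage gap that a mildly non-projective parser narrows. The quadratic pairwise check is chosen for readability rather than speed.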

[1]  Joakim Nivre,et al.  Non-Projective Dependency Parsing in Expected Linear Time , 2009, ACL.

[2]  Carlos Gómez-Rodríguez,et al.  A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing , 2017, ACL.

[3]  Carlos Gómez-Rodríguez,et al.  Improving Coverage and Runtime Complexity for Exact Inference in Non-Projective Transition-Based Dependency Parsers , 2018, NAACL-HLT.

[4]  James Cross,et al.  Incremental Parsing with Minimal Features Using Bi-Directional LSTM , 2016, ACL.

[5]  Noah A. Smith,et al.  Transition-Based Dependency Parsing with Stack Long Short-Term Memory , 2015, ACL.

[6]  Hai Zhao,et al.  A Transition-based System for Universal Dependency Parsing , 2017, CoNLL Shared Task.

[7]  Slav Petrov,et al.  Globally Normalized Transition-Based Neural Networks , 2016, ACL.

[8]  Giorgio Satta,et al.  Dynamic Programming Algorithms for Transition-Based Dependency Parsers , 2011, ACL.

[9]  Fernando Pereira,et al.  Online Learning of Approximate Dependency Parsing Algorithms , 2006, EACL.

[10]  Giorgio Satta,et al.  On the Complexity of Non-Projective Data-Driven Dependency Parsing , 2007, IWPT.

[11]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[12]  Danqi Chen,et al.  A Fast and Accurate Dependency Parser using Neural Networks , 2014, EMNLP.

[13]  Joakim Nivre,et al.  Deterministic Dependency Parsing of English Text , 2004, COLING.

[14]  Dan Klein,et al.  Parsing with Traces: An O(n^4) Algorithm and a Structural Representation , 2017, TACL.

[15]  Fernando Pereira,et al.  Non-Projective Dependency Parsing using Spanning Tree Algorithms , 2005, HLT.

[16]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[17]  David J. Weir,et al.  Parsing Mildly Non-Projective Dependency Structures , 2009, EACL.

[18]  Nizar Habash,et al.  CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies , 2017, CoNLL.

[19]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.

[20]  Eliyahu Kiperwasser,et al.  Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations , 2016, TACL.

[21]  Marco Kuhlmann,et al.  Exploiting Structure in Parsing to 1-Endpoint-Crossing Graphs , 2017, IWPT.

[22]  Ben Taskar,et al.  Learning structured prediction models: a large margin approach , 2005, ICML.

[23]  Emily Pitler,et al.  A Crossing-Sensitive Third-Order Factorization for Dependency Parsing , 2014, TACL.

[24]  Kevin Duh,et al.  DyNet: The Dynamic Neural Network Toolkit , 2017, ArXiv.

[25]  David J. Weir,et al.  Dependency Parsing Schemata and Mildly Non-Projective Dependency Parsing , 2011, CL.

[26]  Philipp Koehn,et al.  Synthesis Lectures on Human Language Technologies , 2016.

[27]  Jason Eisner,et al.  Three New Probabilistic Models for Dependency Parsing: An Exploration , 1996, COLING.

[28]  Giorgio Satta,et al.  Exact Inference for Generative Probabilistic Non-Projective Dependency Parsing , 2011, EMNLP.

[29]  Lillian Lee,et al.  Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set , 2017, EMNLP.

[30]  Weiwei Sun,et al.  Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs , 2017, EMNLP.

[31]  Jan Hajic,et al.  UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing , 2016, LREC.

[32]  Joakim Nivre,et al.  Transition-based Dependency Parsing with Rich Non-local Features , 2011, ACL.

[33]  Emily Pitler,et al.  A Linear-Time Transition System for Crossing Interval Trees , 2015, NAACL.

[34]  Timothy Dozat,et al.  Stanford’s Graph-based Neural Dependency Parser at the CoNLL 2017 Shared Task , 2017, CoNLL.

[35]  Yuan Zhang,et al.  Stack-propagation: Improved Representation Learning for Syntax , 2016, ACL.

[36]  Yuji Matsumoto,et al.  Statistical Dependency Analysis with Support Vector Machines , 2003, IWPT.

[37]  Yao Cheng,et al.  Combining Global Models for Parsing Universal Dependencies , 2017, CoNLL.

[38]  Joseph Le Roux,et al.  Dependency Parsing with Bounded Block Degree and Well-nestedness via Lagrangian Relaxation and Branch-and-Bound , 2016, ACL.

[39]  Giuseppe Attardi,et al.  Experiments with a Multilanguage Non-Projective Dependency Parser , 2006, CoNLL.

[40]  Kenji Sagae,et al.  Dynamic Programming for Linear-Time Incremental Parsing , 2010, ACL.

[41]  Sampath Kannan,et al.  Finding Optimal 1-Endpoint-Crossing Trees , 2013, TACL.

[42]  Sampath Kannan,et al.  Dynamic Programming for Higher Order Parsing of Gap-Minding Trees , 2012, EMNLP.

[43]  Timothy Dozat,et al.  Deep Biaffine Attention for Neural Dependency Parsing , 2016, ICLR.

[44]  Weiwei Sun,et al.  Parsing to 1-Endpoint-Crossing, Pagenumber-2 Graphs , 2017, ACL.

[45]  Carlos Gómez-Rodríguez.  Restricted Non-Projectivity: Coverage vs. Efficiency , 2016, Computational Linguistics.

[46]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[47]  Zoubin Ghahramani,et al.  A Theoretically Grounded Application of Dropout in Recurrent Neural Networks , 2015, NIPS.

[48]  Alexis Nasr,et al.  Pseudo-Projectivity, A Polynomially Parsable Non-Projective Dependency Grammar , 1998, ACL.

[49]  David J. Weir,et al.  A Deductive Approach to Dependency Parsing , 2008, ACL.