Stanford’s Graph-based Neural Dependency Parser at the CoNLL 2017 Shared Task

This paper describes the neural dependency parser submitted by Stanford to the CoNLL 2017 Shared Task on parsing Universal Dependencies. Our system uses relatively simple LSTM networks to produce part-of-speech tags and labeled dependency parses from segmented and tokenized sequences of words. To address the rare-word problem that abounds in languages with complex morphology, we include a character-based word representation that uses an LSTM to produce embeddings from sequences of characters. Our system ranked first on all five metrics relevant to it: UPOS tagging (93.09%), XPOS tagging (82.27%), unlabeled attachment score (81.30%), labeled attachment score (76.30%), and content-word labeled attachment score (72.57%).
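To make the character-based word representation concrete, the following is a minimal sketch of the general idea: an LSTM reads a word one character at a time, and its final hidden state serves as the word's embedding, so even a word never seen in training receives a vector. It is an illustration under stated assumptions, not the submitted system. The class name `CharLSTMEmbedder`, the dimensions, and the randomly initialized (untrained) weights are all hypothetical, and the abstract does not specify the system's actual architecture or hyperparameters.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


class CharLSTMEmbedder:
    """Hypothetical, untrained character-level LSTM word embedder."""

    def __init__(self, alphabet, char_dim=8, hidden_dim=16, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Index 0 is reserved for characters outside the alphabet.
        self.char_to_id = {ch: i + 1 for i, ch in enumerate(sorted(set(alphabet)))}
        n_chars = len(self.char_to_id) + 1

        def matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Assumption: small random weights stand in for trained parameters.
        self.char_emb = matrix(n_chars, char_dim)
        # One stacked weight matrix for the four gates, applied to [x; h].
        self.W = matrix(4 * hidden_dim, char_dim + hidden_dim)
        self.b = [0.0] * (4 * hidden_dim)

    def embed(self, word):
        """Run the LSTM over the characters and return the final hidden state."""
        H = self.hidden_dim
        h = [0.0] * H
        c = [0.0] * H
        for ch in word:
            x = self.char_emb[self.char_to_id.get(ch, 0)]
            xh = x + h  # list concatenation: [x; h]
            z = [sum(w * v for w, v in zip(row, xh)) + b
                 for row, b in zip(self.W, self.b)]
            i_gate = [sigmoid(v) for v in z[0:H]]          # input gate
            f_gate = [sigmoid(v) for v in z[H:2 * H]]      # forget gate
            o_gate = [sigmoid(v) for v in z[2 * H:3 * H]]  # output gate
            g = [math.tanh(v) for v in z[3 * H:4 * H]]     # candidate cell values
            c = [f * cv + i * gv for f, cv, i, gv in zip(f_gate, c, i_gate, g)]
            h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
        return h


embedder = CharLSTMEmbedder("abcdefghijklmnopqrstuvwxyz")
rare_word_vector = embedder.embed("unparseable")  # any character string gets a vector
```

Because the embedding is computed from characters rather than looked up in a fixed vocabulary, morphologically related forms share parameters, which is the property that makes this kind of representation attractive for rare words.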
