A Fast and Accurate Dependency Parser using Neural Networks

Almost all current dependency parsers classify based on millions of sparse indicator features. Not only do these features generalize poorly, but the cost of feature computation restricts parsing speed significantly. In this work, we propose a novel way of learning a neural network classifier for use in a greedy, transition-based dependency parser. Because this classifier learns and uses just a small number of dense features, it can work very fast, while achieving an improvement of about 2% in unlabeled and labeled attachment scores on both English and Chinese datasets. Concretely, our parser is able to parse more than 1000 sentences per second at 92.2% unlabeled attachment score on the English Penn Treebank.
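To make the greedy, transition-based setting concrete, the following is a minimal sketch of such a parsing loop, not the parser described above. It assumes an arc-standard transition system (the abstract does not name one), and every name in it (`legal_actions`, `greedy_parse`, `make_oracle_scorer`) is hypothetical. The scoring function stands in for the neural network classifier: here it is a hand-written rule that reads gold heads, whereas a trained model would score each transition from dense features of the current configuration.

```python
from typing import Callable, Dict, List

SHIFT, LEFT_ARC, RIGHT_ARC = "SHIFT", "LEFT-ARC", "RIGHT-ARC"
ROOT = 0  # index 0 is an artificial root; words are numbered from 1

Scorer = Callable[[List[int], List[int]], Dict[str, float]]


def legal_actions(stack: List[int], buffer: List[int]) -> List[str]:
    """Transitions allowed in the current configuration (arc-standard, assumed)."""
    actions = []
    if buffer:
        actions.append(SHIFT)
    if len(stack) >= 2:
        if stack[-2] != ROOT:  # ROOT can never become a dependent
            actions.append(LEFT_ARC)
        actions.append(RIGHT_ARC)
    return actions


def greedy_parse(n_words: int, score: Scorer) -> List[int]:
    """Apply the best-scoring legal transition at each step; return heads[i] per word."""
    stack = [ROOT]
    buffer = list(range(1, n_words + 1))
    heads = [-1] * (n_words + 1)  # heads[0] is unused and stays -1
    while buffer or len(stack) > 1:
        scores = score(stack, buffer)
        action = max(legal_actions(stack, buffer), key=lambda a: scores[a])
        if action == SHIFT:
            stack.append(buffer.pop(0))
        elif action == LEFT_ARC:
            dep = stack.pop(-2)      # second-from-top becomes a dependent of the top
            heads[dep] = stack[-1]
        else:  # RIGHT_ARC
            dep = stack.pop()        # top becomes a dependent of the new top
            heads[dep] = stack[-1]
    return heads


def make_oracle_scorer(gold: List[int]) -> Scorer:
    """Placeholder for a trained classifier: scores transitions using gold heads."""
    def score(stack: List[int], buffer: List[int]) -> Dict[str, float]:
        scores = {SHIFT: 0.0, LEFT_ARC: 0.0, RIGHT_ARC: 0.0}
        if len(stack) >= 2 and gold[stack[-2]] == stack[-1]:
            scores[LEFT_ARC] = 1.0
        elif (len(stack) >= 2 and gold[stack[-1]] == stack[-2]
              and all(gold[w] != stack[-1] for w in buffer)):
            scores[RIGHT_ARC] = 1.0  # only once the top has collected its dependents
        else:
            scores[SHIFT] = 1.0
        return scores
    return score


# Toy sentence "He eats fish": He <- eats, eats <- ROOT, fish <- eats.
gold_heads = [-1, 2, 0, 2]
print(greedy_parse(3, make_oracle_scorer(gold_heads)))  # [-1, 2, 0, 2]
```

Because the loop makes exactly one classifier call per transition and needs 2n transitions for an n-word sentence, parsing time in this setting is dominated by the cost of scoring a configuration, which is why a classifier over a small number of dense features can be fast.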
