A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing

We present a novel neural network model that jointly learns POS tagging and graph-based dependency parsing. Our model uses bidirectional LSTMs to learn feature representations shared by the POS tagging and dependency parsing tasks, thus addressing the feature-engineering problem. Our extensive experiments on 19 languages from the Universal Dependencies project show that our model outperforms the state-of-the-art neural network-based Stack-propagation model for joint POS tagging and transition-based dependency parsing, resulting in a new state of the art. Our code is open-source and available together with pre-trained models at: https://github.com/datquocnguyen/jPTDP.
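The idea of one shared representation feeding both tasks can be illustrated with a minimal sketch. It is not the released jPTDP code, and several parts are assumptions made for illustration: random vectors stand in for the BiLSTM outputs, the arc scorer is a plain bilinear form, and both tasks are trained with a summed cross-entropy loss (a head-selection simplification, which may differ from the objective the authors actually use). The names `joint_loss`, `tag_weights`, and `arc_weights` are hypothetical.

```python
import math
import random


def linear(weights, vec):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_z for s in scores]


def joint_loss(shared, tag_weights, arc_weights, gold_tags, gold_heads):
    """Sum a tagging loss and an arc loss computed from the SAME vectors.

    shared[i] is the shared feature vector of token i; index 0 is an
    artificial ROOT token. In the paper these vectors come from BiLSTMs;
    here they are simply given as input (assumption).
    """
    n = len(shared)
    tag_loss = 0.0
    arc_loss = 0.0
    for d in range(1, n):
        # Tagging head: a softmax over tags for token d.
        tag_log_probs = log_softmax(linear(tag_weights, shared[d]))
        tag_loss -= tag_log_probs[gold_tags[d]]

        # Parsing head (graph-based): score every candidate head h of d.
        candidates = [h for h in range(n) if h != d]
        projected = linear(arc_weights, shared[d])
        arc_scores = [dot(shared[h], projected) for h in candidates]
        arc_log_probs = log_softmax(arc_scores)
        arc_loss -= arc_log_probs[candidates.index(gold_heads[d])]

    # Assumption: the two task losses are simply added.
    return tag_loss + arc_loss


# Toy sentence: ROOT plus three tokens, 4-dimensional shared features.
rng = random.Random(0)
dim, num_tags = 4, 3
shared = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(4)]
tag_weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_tags)]
arc_weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]
gold_tags = [None, 0, 2, 1]   # position 0 (ROOT) has no tag
gold_heads = [None, 2, 0, 2]  # token 2 attaches to ROOT

print(joint_loss(shared, tag_weights, arc_weights, gold_tags, gold_heads))
```

Because both heads read the same `shared` vectors, a gradient step on the summed loss would update the shared encoder with signal from both tasks, which is the sense in which the features are learned jointly. A full graph-based parser would additionally decode a well-formed tree from the arc scores rather than pick each head independently.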
