Sequence Labeling Parsing by Learning across Representations

We use parsing as sequence labeling as a common framework to learn across constituency and dependency syntactic abstractions. To do so, we cast the problem as multitask learning (MTL). First, we show that adding one parsing paradigm as an auxiliary loss consistently improves performance on the other paradigm. Second, we explore an MTL sequence labeling model that parses both representations at almost no cost in performance or speed. Across all experiments, MTL models with auxiliary losses outperform single-task ones on average by 1.05 F1 points for constituency parsing and by 0.62 UAS points for dependency parsing.
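The auxiliary-loss idea can be illustrated with a minimal sketch. It assumes each paradigm is trained with a per-token cross-entropy loss and that the auxiliary loss is added with a fixed weight; the function names, the weight `aux_weight`, and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def token_cross_entropy(probs: Sequence[Sequence[float]], gold: Sequence[int]) -> float:
    """Mean negative log-likelihood of the gold label at each token."""
    return -sum(math.log(p[g]) for p, g in zip(probs, gold)) / len(gold)


def mtl_loss(main_loss: float, aux_losses: Sequence[float], aux_weight: float = 0.1) -> float:
    """Main-task loss plus a weighted sum of auxiliary-task losses.

    aux_weight is an assumed hyperparameter, chosen here only for illustration.
    """
    return main_loss + aux_weight * sum(aux_losses)


# Toy two-token sentence: predicted label distributions for each task.
# Here constituency labels are the main task and dependency labels the auxiliary one.
constituency_probs = [[0.5, 0.5], [0.25, 0.75]]
dependency_probs = [[0.5, 0.5], [0.5, 0.5]]
gold_constituency = [0, 0]
gold_dependency = [1, 0]

main = token_cross_entropy(constituency_probs, gold_constituency)
aux = token_cross_entropy(dependency_probs, gold_dependency)
total = mtl_loss(main, [aux], aux_weight=0.5)
```

Swapping which paradigm supplies `main` and which supplies `aux` gives the reverse setup, in which constituency labels act as the auxiliary signal for dependency parsing.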
