End-to-End Chinese Parsing Exploiting Lexicons

Chinese parsing has traditionally been handled by a pipeline of three separate modules: word segmentation, part-of-speech tagging, and dependency parsing. In this paper, we propose an end-to-end Chinese parsing model that takes characters as input and jointly learns to output word segmentation, part-of-speech tags, and dependency structures. In particular, our parsing model relies on word-char graph attention networks, which enrich the character inputs with external word knowledge. Experiments on three Chinese parsing benchmark datasets show the effectiveness of our models, which achieve state-of-the-art results on end-to-end Chinese parsing.
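As a rough illustration of how external word knowledge might be attached to character inputs, the following sketch matches lexicon words against a character sequence and links each matched word to the characters it covers. It is a minimal sketch under stated assumptions, not the construction used in the paper: the function names, the `max_len` cutoff, the toy lexicon, and the edge layout are all hypothetical, and the graph attention layer that would consume these edges is omitted.

```python
from typing import List, Set, Tuple


def match_lexicon_words(
    chars: str, lexicon: Set[str], max_len: int = 4
) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, word) for every multi-character
    lexicon word found in `chars`. `max_len` is an assumed cutoff."""
    matches = []
    for start in range(len(chars)):
        # Only consider spans of at least two characters.
        for end in range(start + 2, min(len(chars), start + max_len) + 1):
            word = chars[start:end]
            if word in lexicon:
                matches.append((start, end, word))
    return matches


def build_word_char_edges(chars: str, lexicon: Set[str]) -> List[Tuple[int, int]]:
    """Link each matched word node (by index in the match list) to every
    character position it covers. Hypothetical edge layout."""
    edges = []
    for word_id, (start, end, _) in enumerate(match_lexicon_words(chars, lexicon)):
        for char_idx in range(start, end):
            edges.append((word_id, char_idx))
    return edges


if __name__ == "__main__":
    sentence = "南京市长江大桥"
    toy_lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
    for start, end, word in match_lexicon_words(sentence, toy_lexicon):
        print(start, end, word)
    print(build_word_char_edges(sentence, toy_lexicon))
```

The example sentence is ambiguous at the character level ("南京市 / 长江大桥" versus "南京 / 市长 / ..."), so several overlapping lexicon words attach to the same characters. In a design like the one the abstract describes, an attention mechanism over such edges would presumably weight the competing words rather than commit to one segmentation up front.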
