Improving Sequence-to-Sequence Constituency Parsing

Sequence-to-sequence constituency parsing casts tree-structured prediction as a general sequence prediction problem through top-down tree linearization, which makes it easy to train in parallel on distributed infrastructure. Despite its success, it relies on a general-purpose probabilistic attention mechanism, which cannot guarantee that the selected context is informative for parsing. Previous work introduced deterministic attention to select informative context for sequence-to-sequence parsing, but that approach is based on bottom-up linearization, even though top-down linearization has been observed to outperform bottom-up linearization for standard sequence-to-sequence constituency parsing. In this paper, we therefore extend deterministic attention to operate directly on the top-down tree linearization. Extensive experiments show that our parser delivers substantial accuracy improvements over the bottom-up linearization, achieving an F-score of 92.3 on section 23 of the Penn English Treebank and 85.4 on the Penn Chinese Treebank test set, without reranking or semi-supervised training.
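
To make the notion of top-down tree linearization concrete, the following is a minimal sketch of one common bracket-based scheme, in which a tree is flattened into a token sequence by a pre-order traversal. The tree encoding, the function name `linearize_top_down`, the labeled closing brackets, and the replacement of part-of-speech tags by a placeholder `XX` are all assumptions made for illustration; the exact linearization used in the paper may differ.

```python
from typing import List

# Assumed tree encoding (hypothetical, for illustration only):
#   leaf:          (pos_tag, word)          e.g. ("NNP", "John")
#   internal node: (label, [child, ...])    e.g. ("NP", [("NNP", "John")])

def linearize_top_down(tree) -> List[str]:
    """Flatten a constituency tree into tokens by pre-order traversal."""
    label, rest = tree
    if isinstance(rest, str):
        # Leaf: emit a placeholder instead of the POS tag or word (an assumption).
        return ["XX"]
    tokens = ["(" + label]           # open the constituent before its children
    for child in rest:
        tokens.extend(linearize_top_down(child))
    tokens.append(")" + label)       # close it after all children are emitted
    return tokens

example = ("S", [("NP", [("NNP", "John")]), ("VP", [("VBZ", "sleeps")])])
print(" ".join(linearize_top_down(example)))
# prints: (S (NP XX )NP (VP XX )VP )S
```

Because each constituent label is emitted before its children, a decoder that generates this sequence left to right predicts the tree from the root downward, which is what distinguishes it from a bottom-up (shift-reduce style) action sequence.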
