Straight to the Tree: Constituency Parsing with Neural Syntactic Distance

In this work, we propose a novel constituency parsing scheme. The model first predicts a real-valued scalar, named syntactic distance, for each split position in the sentence. The topology of the grammar tree is then determined by the values of the syntactic distances. Compared to traditional shift-reduce parsing schemes, our approach is free from potentially disastrous compounding errors. It is also easier to parallelize and much faster. Our model achieves state-of-the-art single-model F1 scores of 92.1 on the PTB dataset and 86.4 on the CTB dataset, surpassing previous single-model results by a large margin.
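
The abstract does not spell out how the distances determine the tree. As a minimal sketch, assume the decoding rule is a greedy top-down split: each span is divided at the split position with the largest distance, and the two halves are processed recursively. The function name `distances_to_tree`, the nested-tuple tree representation, and the example distance values are hypothetical, chosen for illustration only; constituent labels are omitted.

```python
from typing import List, Tuple, Union

# Hypothetical representation: a leaf is a word, an internal node is a (left, right) pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def distances_to_tree(words: List[str], distances: List[float]) -> Tree:
    """Build an unlabeled binary tree from per-split scalars.

    distances[i] is the predicted value for the split between
    words[i] and words[i + 1], so len(distances) == len(words) - 1.
    Assumed rule: split each span at its largest distance, then recurse.
    """
    if not words or len(words) != len(distances) + 1:
        raise ValueError("need n >= 1 words and exactly n - 1 distances")
    if len(words) == 1:
        return words[0]
    # Position of the largest distance; ties resolve to the leftmost position.
    split = max(range(len(distances)), key=lambda i: distances[i])
    left = distances_to_tree(words[: split + 1], distances[:split])
    right = distances_to_tree(words[split + 1 :], distances[split + 1 :])
    return (left, right)


sentence = ["she", "enjoys", "playing", "tennis"]
right_branching = distances_to_tree(sentence, [3.0, 2.0, 1.0])
balanced = distances_to_tree(sentence, [1.0, 3.0, 2.0])
print(right_branching)  # ('she', ('enjoys', ('playing', 'tennis')))
print(balanced)         # (('she', 'enjoys'), ('playing', 'tennis'))
```

Under this assumed rule, the scalars can all be predicted before any structural decision is made, and the tree is then read off from their values alone, which is consistent with the abstract's contrast with shift-reduce parsing, where each action depends on the actions taken before it.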
