Expected F-Measure Training for Shift-Reduce Parsing with Recurrent Neural Networks

Xu acknowledges the Carnegie Trust for the Universities of Scotland and the Cambridge Trusts for funding. Clark is supported by ERC Starting Grant DisCoTex (306920) and EPSRC grant EP/I037512/1.
