CCG Supertagging with a Recurrent Neural Network

Recent work on supertagging using a feedforward neural network achieved significant improvements for CCG supertagging and parsing (Lewis and Steedman, 2014). However, their architecture is limited to considering local contexts and does not naturally model sequences of arbitrary length. In this paper, we show how directly capturing sequence information using a recurrent neural network leads to further accuracy improvements for both supertagging (up to 1.9%) and parsing (up to 1% F1), on CCGbank, Wikipedia, and biomedical text.
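
The abstract contrasts a feedforward tagger's fixed local window with a recurrent network whose hidden state summarizes the entire left context of the sentence. As a rough illustration only, and not the authors' implementation, the sketch below shows an Elman-style recurrent supertagger in plain NumPy; the embedding size, hidden size, and the tiny category inventory are invented for the example, and the random weights stand in for trained parameters.

```python
# Minimal sketch (not the paper's model) of an Elman-style RNN supertagger:
# at each position the hidden state encodes the full left context, and a
# softmax over the hidden state scores CCG lexical categories for that word.
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM, HIDDEN_DIM = 50, 100
TAGS = ["NP", "N", "N/N", "(S\\NP)/NP", "S\\NP"]   # tiny illustrative inventory

# Randomly initialised parameters stand in for trained weights.
W_xh = rng.normal(scale=0.1, size=(HIDDEN_DIM, EMBED_DIM))
W_hh = rng.normal(scale=0.1, size=(HIDDEN_DIM, HIDDEN_DIM))
b_h  = np.zeros(HIDDEN_DIM)
W_hy = rng.normal(scale=0.1, size=(len(TAGS), HIDDEN_DIM))
b_y  = np.zeros(len(TAGS))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def supertag(embeddings):
    """Run the RNN over a sentence (a list of word embeddings) and return,
    for each word, a probability distribution over the supertag inventory."""
    h = np.zeros(HIDDEN_DIM)                 # initial hidden state
    distributions = []
    for x in embeddings:
        # Elman recurrence: the hidden state depends on the current word and
        # the previous hidden state, so it can encode unbounded left context.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        distributions.append(softmax(W_hy @ h + b_y))
    return distributions

# Usage with random embeddings standing in for a 4-word sentence.
sentence = [rng.normal(size=EMBED_DIM) for _ in range(4)]
for i, dist in enumerate(supertag(sentence)):
    print(f"word {i}: best tag = {TAGS[int(np.argmax(dist))]}")
```

In the paper's setting the output layer ranges over the full inventory of several hundred CCGbank lexical categories and the network is trained on the treebank; the sketch only illustrates how a recurrent hidden state replaces the fixed context window of a feedforward tagger.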

[1] James R. Curran, Stephen Clark, and Johan Bos. Linguistically Motivated Large-Scale NLP with C&C and Boxer. ACL, 2007.

[2] Tomáš Mikolov. Statistical Language Models Based on Neural Networks. PhD thesis, Brno University of Technology, 2012.

[3] Sampo Pyysalo et al. BioInfer: A Corpus for Information Extraction in the Biomedical Domain. BMC Bioinformatics, 2007.

[4] Srinivas Bangalore and Aravind K. Joshi. Supertagging: An Approach to Almost Parsing. Computational Linguistics, 1999.

[5] Michael Auli and Adam Lopez. Training a Log-Linear Parser with Loss Functions via Softmax-Margin. EMNLP, 2011.

[6] Julia Hockenmaier and Mark Steedman. CCGbank: A Corpus of CCG Derivations and Dependency Structures Extracted from the Penn Treebank. Computational Linguistics, 2007.

[7] Joël Legrand and Ronan Collobert. Joint RNN-Based Greedy Parsing and Word Composition. ICLR, 2014.

[8] Stephen Clark and James R. Curran. Wide-Coverage Efficient Statistical Parsing with CCG and Log-Linear Models. Computational Linguistics, 2007.

[9] Joel Nothman et al. Evaluating a Statistical CCG Parser on Wikipedia. PWNLP@IJCNLP, 2009.

[10] Yue Zhang and Stephen Clark. Shift-Reduce CCG Parsing. ACL, 2011.

[11] James R. Curran, Stephen Clark, and David Vadas. Multi-Tagging for Lexicalized-Grammar Parsing. ACL, 2006.

[12] Nitish Srivastava et al. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research, 2014.

[13] Wenduan Xu, Stephen Clark, and Yue Zhang. Shift-Reduce CCG Parsing with a Dependency Model. ACL, 2014.

[14] Jeffrey L. Elman. Finding Structure in Time. Cognitive Science, 1990.

[15] Jonathan K. Kummerfeld et al. Faster Parsing by Supertagger Adaptation. ACL, 2010.

[16] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning Representations by Back-propagating Errors. Nature, 1986.

[17] Stephen Clark and James R. Curran. The Importance of Supertagging for Wide-Coverage CCG Parsing. COLING, 2004.

[18] Mike Lewis and Mark Steedman. Improved CCG Parsing with Semi-supervised Supertagging. TACL, 2014.

[19] Laura Rimell and Stephen Clark. Adapting a Lexicalized-Grammar Parser to Contrasting Domains. EMNLP, 2008.

[20] Mark Steedman. The Syntactic Process. MIT Press, 2000.

[21] Joseph Turian, Lev Ratinov, and Yoshua Bengio. Word Representations: A Simple and General Method for Semi-Supervised Learning. ACL, 2010.

[22] Tomáš Mikolov et al. Recurrent Neural Network Based Language Model. INTERSPEECH, 2010.