A Fast Unified Model for Parsing and Sentence Understanding

Tree-structured neural networks exploit valuable syntactic parse information as they interpret the meanings of sentences. However, they suffer from two key technical problems that make them slow and unwieldy for large-scale NLP tasks: they usually operate on parsed sentences and they do not directly support batched computation. We address these issues by introducing the Stack-augmented Parser-Interpreter Neural Network (SPINN), which combines parsing and interpretation within a single tree-sequence hybrid model by integrating tree-structured sentence interpretation into the linear sequential structure of a shift-reduce parser. Our model supports batched computation for a speedup of up to 25x over other tree-structured models, and its integrated parser can operate on unparsed data with little loss in accuracy. We evaluate it on the Stanford NLI entailment task and show that it significantly outperforms other sentence-encoding models.
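
To make the shift-reduce interpretation concrete, below is a minimal Python sketch of the control loop the abstract describes: SHIFT moves the next token from the buffer onto a stack, and REDUCE composes the top two stack entries into a single constituent. All names here are illustrative stand-ins, not the paper's code; in the full model the composition is a learned neural layer (TreeLSTM-style) over vector representations rather than the toy string function used here.

    # Toy shift-reduce interpreter: builds a binary tree over a token
    # sequence by following a SHIFT/REDUCE transition sequence.
    SHIFT, REDUCE = "shift", "reduce"

    def interpret(tokens, transitions, compose):
        """Consume the token buffer under the given transitions.

        A binary parse of n tokens has n SHIFTs and n-1 REDUCEs, so
        exactly one element (the sentence representation) remains on
        the stack at the end.
        """
        buffer = list(tokens)  # queue of leaf representations
        stack = []             # partially built constituents
        for op in transitions:
            if op == SHIFT:
                # Move the next token onto the stack.
                stack.append(buffer.pop(0))
            else:
                # REDUCE: combine the top two stack entries into one node.
                right, left = stack.pop(), stack.pop()
                stack.append(compose(left, right))
        assert len(stack) == 1 and not buffer
        return stack[0]

    if __name__ == "__main__":
        # "(the (old cat))" corresponds to SHIFT SHIFT SHIFT REDUCE REDUCE.
        print(interpret(["the", "old", "cat"],
                        [SHIFT, SHIFT, SHIFT, REDUCE, REDUCE],
                        lambda l, r: "(%s %s)" % (l, r)))

Because every step touches only the top of the stack and the head of the buffer, the loop is a flat sequence of identical operations, which is what lets the model batch tree computation across sentences with different parse structures.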
