The Lifted Matrix-Space Model for Semantic Composition

Recent advances in tree-structured sentence encoding models have shown that explicitly modeling syntax can help handle compositionality. In particular, \citetext{Socher2012}, \citetext{Socher2013}, and \citetext{Chen2013} show that composition functions with multiplicative interactions inside tree-structured models can yield significant improvements in performance. However, existing compositional approaches that use these multiplicative interactions typically must either learn task-specific matrix-shaped word embeddings or rely on third-order tensors, both of which can be very costly. This paper introduces the Lifted Matrix-Space model, which improves on its predecessors in this respect. The model learns a single global transformation from pre-trained word embeddings into matrices, which are then composed via matrix multiplication. The upshot is that we capture multiplicative interactions without learning matrix-valued word representations from scratch. In addition, our composition function transmits a larger number of activations across layers with comparatively few model parameters. We evaluate the model on the Stanford NLI corpus and the Multi-Genre NLI corpus and find that the Lifted Matrix-Space model outperforms tree-structured long short-term memory (LSTM) networks.
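
To make the composition step concrete, here is a minimal sketch in Python/PyTorch, assuming a single linear "lift" from d-dimensional pre-trained embeddings to k x k matrices and plain matrix multiplication at each parent node. The class and method names (LiftedMatrixSpaceComposer, lift_word, compose), the tanh nonlinearity, and the dimensions are illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class LiftedMatrixSpaceComposer(nn.Module):
    """Illustrative sketch: lift pre-trained word vectors into k x k
    matrices with one shared ("global") transformation, then compose
    constituents by matrix multiplication."""

    def __init__(self, embed_dim: int = 300, k: int = 16):
        super().__init__()
        # A single transformation shared across the whole vocabulary,
        # so no matrix-valued embedding is learned per word.
        self.k = k
        self.lift = nn.Linear(embed_dim, k * k)

    def lift_word(self, vec: torch.Tensor) -> torch.Tensor:
        # vec: (embed_dim,) pre-trained embedding (e.g. GloVe).
        # Returns a k x k matrix-valued representation of the word.
        return torch.tanh(self.lift(vec)).view(self.k, self.k)

    def compose(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        # Parent-node representation: matrix multiplication supplies the
        # multiplicative interaction between the two children.
        return left @ right


# Usage on a toy right-branching tree, ("the" ("red" "car")):
composer = LiftedMatrixSpaceComposer(embed_dim=300, k=16)
the, red, car = (composer.lift_word(torch.randn(300)) for _ in range(3))
sentence_matrix = composer.compose(the, composer.compose(red, car))
print(sentence_matrix.shape)  # torch.Size([16, 16])

The point the sketch illustrates is that the only learned component is the shared lift (embed_dim x k^2 parameters), so the multiplicative interaction comes from matrix multiplication itself rather than from per-word matrix embeddings or a third-order tensor.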

[1] Jian Zhang, et al. Natural Language Inference over Interaction Space, 2017, ICLR.

[2] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.

[3] Sebastian Rudolph, et al. Compositional Matrix-Space Models of Language, 2010, ACL.

[4] Hong Yu, et al. Neural Semantic Encoders, 2016, EACL.

[5] G. Frege. Über Sinn und Bedeutung, 1892.

[6] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[7] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.

[8] Yorick Wilks, et al. Natural language inference, 1973.

[9] Mirella Lapata, et al. Composition in Distributional Models of Semantics, 2010, Cogn. Sci.

[10] Christopher Potts, et al. A large annotated corpus for learning natural language inference, 2015, EMNLP.

[11] Danqi Chen, et al. Learning New Facts From Knowledge Bases With Neural Tensor Networks and Semantic Word Vectors, 2013, ICLR.

[12] Alexandros Potamianos, et al. Structural Attention Neural Networks for improved sentiment analysis, 2017, EACL.

[13] Chris Barker, et al. Continuations and Natural Language, 2014, Oxford Studies in Theoretical Linguistics.

[14] Christopher D. Manning, et al. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, 2015, ACL.

[15] David J. Weir, et al. Aligning Packed Dependency Trees: A Theory of Composition for Distributional Semantics, 2016, CL.

[16] Yoshua Bengio, et al. The representational geometry of word meanings acquired by neural machine translation models, 2017, Machine Translation.

[17] Andrew Y. Ng, et al. Parsing Natural Scenes and Natural Language with Recursive Neural Networks, 2011, ICML.

[18] Marco Baroni, et al. Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space, 2010, EMNLP.

[19] Ioannis Korkontzelos, et al. Estimating Linear Models for Compositional Distributional Semantics, 2010, COLING.

[20] Christopher D. Manning, et al. Natural language inference, 2009.

[21] Nicholas Asher, et al. Integrating Type Theory and Distributional Semantics: A Case Study on Adjective–Noun Compositions, 2016, CL.

[22] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.

[23] Paul D. Elbourne. Situations and individuals, 2005.

[24] Hongyu Guo, et al. Long Short-Term Memory Over Tree Structures, 2015, arXiv.

[25] Claire Cardie, et al. Compositional Matrix-Space Models for Sentiment Analysis, 2011, EMNLP.

[26] Andrew Y. Ng, et al. Semantic Compositionality through Recursive Matrix-Vector Spaces, 2012, EMNLP.

[27] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.

[28] Gennaro Chierchia, et al. Meaning and Grammar: An Introduction to Semantics, 1990.

[29] Thomas F. Icard III, et al. Recent Progress on Monotonicity, 2014, LILT.

[30] Katrin Erk, et al. A Structured Vector Space Model for Word Meaning in Context, 2008, EMNLP.

[31] Yoshua Bengio, et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies, 2001.

[32] S. Clark, et al. A Compositional Distributional Model of Meaning, 2008.

[33] Phong Le, et al. Compositional Distributional Semantics with Long Short Term Memory, 2015, *SEMEVAL.

[34] Jason Weston, et al. Natural Language Processing (Almost) from Scratch, 2011, J. Mach. Learn. Res.

[35] Irene Heim, et al. Semantics in generative grammar, 1998.

[36] David R. Dowty. Compositionality as an Empirical Problem, 2006.

[37] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[38] Mohit Bansal, et al. Shortcut-Stacked Sentence Encoders for Multi-Domain Inference, 2017, RepEval@EMNLP.

[39] Claire Cardie, et al. Deep Recursive Neural Networks for Compositionality in Language, 2014, NIPS.

[40] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.

[41] Christopher D. Manning, et al. Learning Continuous Phrase Representations and Syntactic Parsing with Recursive Neural Networks, 2010.

[42] Stephen Clark, et al. Concrete Sentence Spaces for Compositional Distributional Models of Meaning, 2010, IWCS.

[43] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.

[44] Geoffrey E. Hinton, et al. Layer Normalization, 2016, arXiv.

[45] Christopher Potts, et al. A Fast Unified Model for Parsing and Sentence Understanding, 2016, ACL.