A Comparison of Vector-based Representations for Semantic Composition

In this paper we address the problem of modeling compositional meaning for phrases and sentences using distributional methods. We experiment with several possible combinations of representation and composition, exhibiting varying degrees of sophistication. Some are shallow while others operate over syntactic structure, rely on parameter learning, or require access to very large corpora. We find that shallow approaches are as good as more computationally intensive alternatives with regards to two particular tests: (1) phrase similarity and (2) paraphrase detection. The sizes of the involved training corpora and the generated vectors are not as important as the fit between the meaning representation and compositional method.

[1]  Jeffrey Pennington,et al.  Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions , 2011, EMNLP.

[2]  Aminul Islam,et al.  Semantic similarity of short texts , 2009 .

[3]  Stephen Wan,et al.  Using Dependency-Based Features to Take the ’Para-farce’ out of Paraphrase , 2006, ALTA.

[4]  Marco Baroni,et al.  Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space , 2010, EMNLP.

[5]  Jeffrey Pennington,et al.  Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection , 2011, NIPS.

[6]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[7]  Gregory Grefenstette,et al.  Explorations in automatic thesaurus discovery , 1994 .

[8]  G. Denhière,et al.  A Computational Model of Children's Semantic Memory , 2004 .

[9]  Peter Bruza,et al.  Quantum Logic of Semantic Space: An Exploratory Investigation of Context Effects in Practical Reasoning , 2005, We Will Show Them!.

[10]  Daoud Clarke,et al.  A Context-Theoretic Framework for Compositionality in Distributional Semantics , 2011, Computational Linguistics.

[11]  Yoshua Bengio,et al.  Word Representations: A Simple and General Method for Semi-Supervised Learning , 2010, ACL.

[12]  Jacob Cohen,et al.  Applied multiple regression/correlation analysis for the behavioral sciences , 1979 .

[13]  Chris Quirk,et al.  Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources , 2004, COLING.

[14]  Walter Kintsch,et al.  Predication , 2001, Cogn. Sci..

[15]  Mehrnoosh Sadrzadeh,et al.  Experimenting with transitive verbs in a DisCoCat , 2011, GEMS.

[16]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[17]  Tat-Seng Chua,et al.  Paraphrase Recognition via Dissimilarity Significance Classification , 2006, EMNLP.

[18]  T. Landauer,et al.  A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge. , 1997 .

[19]  Dan Klein,et al.  Accurate Unlexicalized Parsing , 2003, ACL.

[20]  Mark Steyvers,et al.  Topics in semantic representation. , 2007, Psychological review.

[21]  Alessandro Lenci,et al.  Distributional Memory: A General Framework for Corpus-Based Semantics , 2010, CL.

[22]  Mirella Lapata,et al.  Composition in Distributional Models of Semantics , 2010, Cogn. Sci..

[23]  Samuel Fernando,et al.  A Semantic Similarity Approach to Paraphrase Detection , 2008 .

[24]  Ehud Rivlin,et al.  Placing search in context: the concept revisited , 2002, TOIS.

[25]  Mehrnoosh Sadrzadeh,et al.  Experimental Support for a Categorical Compositional Distributional Model of Meaning , 2011, EMNLP.

[26]  Geoffrey E. Hinton Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Systems , 1991 .

[27]  Tony A. Plate,et al.  Holographic reduced representations , 1995, IEEE Trans. Neural Networks.

[28]  Arthur C. Graesser,et al.  Paraphrase Identification with Lexico-Syntactic Graph Subsumption , 2008, FLAIRS.

[29]  Carlo Strapparava,et al.  Corpus-based and Knowledge-based Measures of Text Semantic Similarity , 2006, AAAI.

[30]  Scott A. McDonald,et al.  Environmental Determinants of Lexical Processing Effort , 2000 .

[31]  Hinrich Schütze,et al.  Automatic Word Sense Discrimination , 1998, Comput. Linguistics.

[32]  Yoshua Bengio,et al.  Neural net language models , 2008, Scholarpedia.

[33]  Peter W. Foltz,et al.  The Measurement of Textual Coherence with Latent Semantic Analysis. , 1998 .

[34]  J.R. Bellegarda,et al.  Exploiting latent semantic information in statistical language modeling , 2000, Proceedings of the IEEE.

[35]  Peter D. Turney Similarity of Semantic Relations , 2006, CL.

[36]  S. Clark,et al.  A Compositional Distributional Model of Meaning , 2008 .