Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space

We propose an approach to adjective-noun (AN) composition for corpus-based distributional semantics that, building on insights from theoretical linguistics, represents nouns as vectors and adjectives as data-induced (linear) functions, encoded as matrices, over nominal vectors. Our model significantly outperforms rival models on the task of reconstructing AN vectors not seen in training. A small post-hoc analysis further suggests that, when the model-generated AN vector is not similar to the corpus-observed AN vector, this is due to anomalies in the latter. We show, moreover, that our approach provides two novel ways to represent adjective meanings, as alternatives to their representation via corpus-based co-occurrence vectors, both of which outperform the latter in an adjective clustering task.
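The core idea, an adjective as a matrix that maps a noun vector to an AN vector, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain (optionally ridge-regularized) least squares as the estimator, uses synthetic random vectors in place of corpus-derived co-occurrence vectors, and all names (`fit_adjective_matrix`, `compose`, `cosine`) are hypothetical.

```python
import numpy as np


def fit_adjective_matrix(noun_vecs, an_vecs, ridge=0.0):
    """Estimate a matrix A such that A @ noun is close to the observed AN vector.

    noun_vecs: array of shape (n_examples, d_in), one noun vector per row.
    an_vecs:   array of shape (n_examples, d_out), the matching AN vectors.
    ridge:     optional L2 penalty (an assumption of this sketch).
    Returns A with shape (d_out, d_in).
    """
    X = np.asarray(noun_vecs, dtype=float)
    Y = np.asarray(an_vecs, dtype=float)
    if ridge > 0:
        d_in = X.shape[1]
        # Closed-form ridge solution: (X^T X + ridge * I)^-1 X^T Y
        W = np.linalg.solve(X.T @ X + ridge * np.eye(d_in), X.T @ Y)
    else:
        # Ordinary least squares: minimizes ||X W - Y||
        W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W.T


def compose(adjective_matrix, noun_vec):
    """Apply the adjective (a linear function) to a noun vector."""
    return adjective_matrix @ noun_vec


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Synthetic stand-in data: a hidden "true" adjective matrix generates AN vectors.
rng = np.random.default_rng(0)
d = 5
true_A = rng.normal(size=(d, d))
train_nouns = rng.normal(size=(50, d))
train_ans = train_nouns @ true_A.T

# Learn the adjective from (noun, AN) training pairs.
A_hat = fit_adjective_matrix(train_nouns, train_ans)

# Reconstruct the AN vector for a noun not seen in training.
held_out_noun = rng.normal(size=d)
predicted = compose(A_hat, held_out_noun)
observed = true_A @ held_out_noun
similarity = cosine(predicted, observed)
```

In this toy setting the data are noiseless, so the learned matrix recovers the generating one; with real corpus vectors the fit would only be approximate, and the similarity between predicted and observed AN vectors is what an evaluation of this kind would measure.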
