Exemplar-Based Word-Space Model for Compositionality Detection: Shared Task System Description

In this paper, we highlight the problems that polysemy poses for word-space models of compositionality detection. Most models represent each word as a single prototype-based vector and do not address polysemy. We propose an exemplar-based model that is designed to handle polysemy. We test this model on compositionality detection and find that it outperforms existing prototype-based models. We participated in the shared task (Biemann and Giesbrecht, 2011), where our best-performing exemplar-based model ranked first in two types of evaluation and second in two others.
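The contrast between the two representations can be illustrated with a minimal sketch. The abstract does not specify how exemplars are stored or selected, so everything below is an assumption for illustration: the toy two-dimensional vectors, the function names `prototype_vector` and `exemplar_vector`, the cosine-based selection, and the parameter `k` are all hypothetical and are not the authors' system.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def prototype_vector(exemplars):
    """Prototype view: one vector per word, averaging every occurrence."""
    return centroid(exemplars)


def exemplar_vector(exemplars, context, k=2):
    """Exemplar view (assumed selection rule): average only the k stored
    occurrences most similar to the current context vector."""
    ranked = sorted(exemplars, key=lambda e: cosine(e, context), reverse=True)
    return centroid(ranked[:k])


# Toy data: four occurrences of a polysemous word in a 2-dimensional space
# whose axes stand for two hypothetical senses.
occurrences = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
context = [1.0, 0.2]  # a context leaning towards the first sense

proto = prototype_vector(occurrences)
selected = exemplar_vector(occurrences, context, k=2)

print("prototype similarity:", round(cosine(proto, context), 3))
print("exemplar similarity: ", round(cosine(selected, context), 3))
```

In this toy setting the prototype mixes both senses into one vector, whereas the exemplar-based vector is built only from occurrences that resemble the context at hand, so it sits closer to the sense actually in use.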

[1] Zellig S. Harris, et al. Distributional Structure, 1954.

[2] Katrin Erk, et al. Exemplar-Based Models for Word Meaning in Context, 2010, ACL.

[3] Raymond J. Mooney, et al. Multi-Prototype Vector-Space Models of Word Meaning, 2010, NAACL.

[4] Magnus Sahlgren, et al. The Word-Space Model: using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces, 2006.

[5] Barbara H. Partee, et al. Lexical semantics and compositionality, 1995.

[6] John Carroll, et al. Detecting a Continuum of Compositionality in Phrasal Verbs, 2003, ACL 2003.

[7] Emiliano Raúl Guevara, et al. Computing Semantic Compositionality in Distributional Semantics, 2011, IWCS.

[8] James Richard Curran, et al. From distributional to semantic similarity, 2004.

[9] Patrick Pantel, et al. From Frequency to Meaning: Vector Space Models of Semantics, 2010, J. Artif. Intell. Res.

[10] Timothy Baldwin, et al. An Empirical Model of Multiword Expression Decomposability, 2003, ACL 2003.

[11] Christian Biemann, et al. Distributional Semantics and Compositionality 2011: Shared Task Description and Results, 2011.

[12] G. Murphy, et al. The Big Book of Concepts, 2002.

[13] Mirella Lapata, et al. Vector-based Models of Semantic Composition, 2008, ACL.

[14] Eugenie Giesbrecht. In Search of Semantic Compositionality in Vector Spaces, 2009, ICCS.

[15] Eugenie Giesbrecht, et al. Automatic Identification of Non-Compositional Multi-Word Expressions using Latent Semantic Analysis, 2006.

[16] Timothy Baldwin, et al. A Statistical Approach to the Semantics of Verb-Particles, 2003, ACL 2003.

[17] Adam Kilgarriff, et al. The Sketch Engine, 2004.

[18] Daniel Jurafsky, et al. Is Knowledge-Free Induction of Multiword Unit Dictionary Headwords a Solved Problem?, 2001, EMNLP.

[19] Adam Kilgarriff, et al. An efficient algorithm for building a distributional thesaurus (and other Sketch Engine developments), 2007, ACL.

[20] Silvia Bernardini, et al. Introducing and evaluating ukWaC, a very large web-derived corpus of English, 2008.