From Global to Local Similarities: A Graph-Based Contextualization Method using Distributional Thesauri

After recasting the computation of a distributional thesaurus in a graph-based framework for term similarity, we introduce a new contextualization method that generates, for each term occurrence in a text, a ranked list of terms that are semantically similar and compatible with the given context. The framework is instantiated by the definition of term and context, which we derive from dependency parses in this work. Evaluating our approach on a standard data set for lexical substitution, we show substantial improvements over a strong non-contextualized baseline across all parts of speech. In contrast to comparable approaches, our framework defines an unsupervised generative method for similarity in context and does not rely on the existence of lexical resources as a source for candidate expansions.
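The step from global to contextualized similarity can be illustrated with a small sketch. It is not the authors' implementation: the toy observations, the function names (`build_features`, `global_similar`, `contextualize`), and both scoring rules are assumptions made for illustration. Global similarity is taken to be the number of context features two terms share, and contextual fit to be the number of the occurrence's context features that an expansion has also been observed with. The context features are written in a dependency-like notation with `@` marking the term's position.

```python
from collections import defaultdict

# Hypothetical (term, context-feature) observations; in the paper these
# would be derived from dependency parses of a large corpus.
OBSERVATIONS = [
    ("cat", "nsubj(purr, @)"), ("cat", "dobj(feed, @)"), ("cat", "amod(@, wild)"),
    ("jaguar", "amod(@, wild)"), ("jaguar", "dobj(feed, @)"),
    ("jaguar", "dobj(drive, @)"), ("jaguar", "amod(@, fast)"),
    ("car", "dobj(drive, @)"), ("car", "amod(@, fast)"), ("car", "dobj(park, @)"),
    ("leopard", "amod(@, wild)"), ("leopard", "dobj(feed, @)"),
    ("truck", "dobj(drive, @)"), ("truck", "dobj(park, @)"),
]


def build_features(observations):
    """Map each term to the set of context features it was seen with."""
    features = defaultdict(set)
    for term, context in observations:
        features[term].add(context)
    return features


def global_similar(term, features):
    """Rank other terms by shared context features (assumed similarity)."""
    target = features.get(term, set())
    scores = {}
    for other, contexts in features.items():
        if other == term:
            continue
        shared = len(target & contexts)
        if shared:
            scores[other] = shared
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def contextualize(term, occurrence_contexts, features):
    """Re-rank the global expansions of `term` by fit with one occurrence."""
    wanted = set(occurrence_contexts)
    ranked = []
    for other, similarity in global_similar(term, features):
        fit = len(features[other] & wanted)
        ranked.append((other, fit, similarity))
    # Contextual fit first, then global similarity, then the term itself.
    ranked.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return [other for other, _, _ in ranked]


FEATURES = build_features(OBSERVATIONS)
print(global_similar("jaguar", FEATURES))
print(contextualize("jaguar", ["dobj(drive, @)"], FEATURES))
print(contextualize("jaguar", ["dobj(feed, @)"], FEATURES))
```

In this toy setting the global list for "jaguar" mixes vehicle and animal senses, while the re-ranked lists move the expansions compatible with the given context to the top. The candidates come from the corpus observations themselves, with no external lexical resource involved, which mirrors the property the abstract emphasizes.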
