Measuring Similarity from Word Pair Matrices with Syntagmatic and Paradigmatic Associations

Two types of semantic similarity are usually distinguished: attributional similarity and relational similarity. They measure the degree of similarity between words and between word pairs, respectively. Attributional similarity is bidirectional, while relational similarity is one-directional. Both can be computed from the occurrences of words in actual sentences. Within sentences, syntagmatic and paradigmatic associations can be used to characterize the relations between words or word pairs. In this paper, we propose a vector space model built from syntagmatic and paradigmatic associations to measure relational similarity between word pairs, using the sentences contained in a small corpus. We conduct two experiments with different datasets: SemEval-2012 Task 2 and a set of 400 word analogy quizzes. The experimental results show that the proposed method is effective when only a small corpus is available.
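
As a rough illustration of the general idea of a word-pair vector space, the sketch below represents each word pair by the word sequences that occur between its two members in a sentence, and compares pairs by cosine similarity. It is not the model proposed in the paper: it covers only in-sentence co-occurrence patterns and ignores paradigmatic associations. The function names, the whitespace tokenization, and the toy sentences are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def pair_pattern_vector(pair, sentences):
    """Count the word sequences found between the two words of `pair`.

    Only sentences where the first word precedes the second are counted,
    so the vector for (a, b) differs from the vector for (b, a).
    """
    first, second = pair
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization (assumption)
        if first in tokens and second in tokens:
            i, j = tokens.index(first), tokens.index(second)
            if i < j:
                counts[" ".join(tokens[i + 1:j])] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, invented for illustration only.
sentences = [
    "the kitten is a young cat",
    "the puppy is a young dog",
    "the kitten chased the dog",
]

kitten_cat = pair_pattern_vector(("kitten", "cat"), sentences)
puppy_dog = pair_pattern_vector(("puppy", "dog"), sentences)
kitten_dog = pair_pattern_vector(("kitten", "dog"), sentences)

sim_related = cosine(kitten_cat, puppy_dog)
sim_unrelated = cosine(kitten_cat, kitten_dog)
```

In this toy corpus, kitten:cat and puppy:dog share the same connecting pattern and so receive a higher score than kitten:cat and kitten:dog. Counting only patterns where the first word precedes the second is one simple way to reflect the one-directional nature of relational similarity.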
