Paper information - Modelling Word Similarity: an Evaluation of Automatic Synonymy Extraction Algorithms

Vector-based models of lexical semantics retrieve semantically related words automatically from large corpora by exploiting the property that words with a similar meaning tend to occur in similar contexts. Despite their increasing popularity, it is unclear which kind of semantic similarity they actually capture and for which kind of words. In this paper, we use three vector-based models to retrieve semantically related words for a set of Dutch nouns and we analyse whether three linguistic properties of the nouns influence the results. In particular, we compare results from a dependency-based model with those from a first-order and a second-order bag-of-words model, and we examine the effect of the nouns' frequency, semantic specificity and semantic class. We find that all three models find more synonyms for high-frequency nouns and for nouns belonging to abstract semantic classes. Semantic specificity does not have a clear influence.
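The idea that words with a similar meaning occur in similar contexts can be illustrated with a minimal sketch of a first-order bag-of-words model. This is not the authors' implementation: the toy English corpus, the window size of 2, the use of raw co-occurrence counts and cosine similarity, and the function names `context_vectors`, `cosine` and `nearest_neighbours` are all assumptions made for illustration. The dependency-based and second-order models compared in the paper are not covered here.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbours(target, vectors, k=3):
    """Return the k words whose context vectors are most similar to the target's."""
    target_vec = vectors.get(target, Counter())
    scores = [
        (other, cosine(target_vec, vec))
        for other, vec in vectors.items()
        if other != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy corpus; a real experiment would use a large parsed corpus.
sentences = [
    "the cat drinks milk".split(),
    "the kitten drinks milk".split(),
    "the car needs fuel".split(),
    "the truck needs fuel".split(),
]
vectors = context_vectors(sentences, window=2)
print(nearest_neighbours("cat", vectors, k=3))
```

In this toy corpus "cat" and "kitten" share all of their context words, so "kitten" ranks first among the neighbours of "cat", whereas "car" shares only the context word "the". Realistic models typically replace raw counts with a weighting scheme and filter the context words, choices this sketch leaves out.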

Yves Peirsman | Kris Heylen | Dirk Speelman | Dirk Geeraerts
