Hyponymy Graph Model for Word Semantic Similarity Measurement

Measuring the semantic similarity of words is a general problem with a broad range of applications, such as ontology mapping, computational linguistics, and artificial intelligence. Previous approaches to computing word semantic similarity did not consider concept occurrence frequency or the number of senses a word has. This paper introduces the Hyponymy graph and, based on it, proposes a novel word semantic similarity model. To compare two words, we first retrieve their related concepts, then build the lowest common ancestor matrix and the distance matrix between those concepts, and finally calculate a distance-based similarity and an information-based similarity, which are integrated to obtain the final semantic similarity. The main contribution of our method is that it takes both concept occurrence frequency and the number of word senses into account. The resulting similarity measure fits human ratings more closely and effectively simulates the human thinking process. Our experimental results on the benchmark datasets M&C and R&G, with WordNet 2.1 as the platform, demonstrate improvements of roughly 0.9%–1.2% over the best existing approaches.
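
The pipeline outlined above can be illustrated with a minimal Python sketch. It is not the model proposed in this paper: the toy taxonomy `PARENT`, the counts in `FREQ`, the sense inventory `SENSES`, the distance-based formula 1/(1 + d), the use of Lin's measure for the information-based part, the weight `alpha`, and the choice of the best-scoring sense pair are all assumptions made for illustration. In particular, the way the number of word senses enters the actual model is not reproduced here.

```python
import math

# Hypothetical toy hyponymy graph: concept -> its parent (hypernym).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

# Hypothetical concept occurrence counts (not taken from any real corpus).
FREQ = {"entity": 0, "animal": 2, "vehicle": 1, "dog": 5, "cat": 4, "car": 8}

# Hypothetical sense inventory: word -> concepts it can denote.
SENSES = {"dog": ["dog"], "cat": ["cat"], "car": ["car"], "jaguar": ["cat", "car"]}


def path_to_root(concept):
    """Return the concept followed by all of its ancestors, up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def lowest_common_ancestor(a, b):
    """Return the deepest concept that subsumes both a and b."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("concepts do not share a root")


def distance(a, b):
    """Number of edges on the path from a to b through their LCA."""
    lca = lowest_common_ancestor(a, b)
    return path_to_root(a).index(lca) + path_to_root(b).index(lca)


def information_content(concept):
    """-log of the probability of the concept or any of its hyponyms."""
    total = sum(FREQ.values())
    subsumed = sum(f for node, f in FREQ.items() if concept in path_to_root(node))
    return -math.log(subsumed / total)


def information_similarity(a, b, lca):
    """Lin's measure, used here as a stand-in information-based similarity."""
    if a == b:
        return 1.0
    return 2 * information_content(lca) / (information_content(a) + information_content(b))


def word_similarity(word1, word2, alpha=0.5):
    """Blend distance- and information-based scores over all sense pairs."""
    senses1, senses2 = SENSES[word1], SENSES[word2]
    # LCA matrix and distance matrix between the concepts of the two words.
    lca_matrix = [[lowest_common_ancestor(a, b) for b in senses2] for a in senses1]
    dist_matrix = [[distance(a, b) for b in senses2] for a in senses1]
    best = 0.0
    for i, a in enumerate(senses1):
        for j, b in enumerate(senses2):
            dist_sim = 1.0 / (1.0 + dist_matrix[i][j])
            info_sim = information_similarity(a, b, lca_matrix[i][j])
            best = max(best, alpha * dist_sim + (1 - alpha) * info_sim)
    return best


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car"), ("jaguar", "car")]:
        print(pair, round(word_similarity(*pair), 3))
```

With this toy data, "dog" and "cat" score higher than "dog" and "car" because their lowest common ancestor lies deeper in the graph and carries more information. On a real lexical database such as WordNet, the dictionaries would be replaced by lookups into its hypernym hierarchy and by corpus-derived concept frequencies.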