Measuring Semantic Similarity between Words Using Web Search Engines
Semantic similarity measures play important roles in many Web-related tasks such as Web browsing and query suggestion. Because taxonomy-based methods cannot deal with continually emerging words, Web-based methods have recently been proposed to solve this problem. However, because of the noise and redundancy hidden in Web data, robustness and accuracy remain challenges. We first propose a method that integrates the page counts and snippets returned by Web search engines. The semantic snippets and the number of search results are then used to remove noise and redundancy from the Web snippets. Finally, we propose a method that integrates page counts, semantic snippets, and the number of already displayed search results. The proposed method does not need any human-annotated knowledge and can be applied to Web-related tasks easily. A correlation coefficient of 0.851 against the Rubenstein-Goodenough benchmark dataset shows that the proposed method outperforms existing Web-based methods by a wide margin. Moreover, the proposed semantic similarity measure significantly improves the quality of query suggestion compared with some page-count-based methods.
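
The abstract does not give the formulas behind the proposed measure, so the following is only a minimal sketch of the two generic ingredients it mentions: a similarity score derived from search-engine page counts, and a correlation coefficient used to compare scores with human ratings. The Jaccard coefficient is swapped in here as one common page-count measure and is not claimed to be the authors' method; the function names `jaccard_from_page_counts` and `pearson` are hypothetical, and the page counts are assumed to have been obtained already from some search engine.

```python
from math import sqrt


def jaccard_from_page_counts(count_p, count_q, count_pq):
    """Jaccard coefficient from page counts (an assumed, generic measure).

    count_p  -- number of pages returned for the query "P"
    count_q  -- number of pages returned for the query "Q"
    count_pq -- number of pages returned for the query "P AND Q"
    """
    union = count_p + count_q - count_pq
    # Treat missing co-occurrence or inconsistent counts as zero similarity.
    if count_pq <= 0 or union <= 0:
        return 0.0
    return count_pq / union


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

In an evaluation of the kind the abstract describes, one would compute a similarity score for every word pair in the benchmark and pass the list of scores together with the list of human ratings to `pearson`. Snippet-based features and the noise-removal step are outside the scope of this sketch.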