Research on measuring semantic correlation based on the Wikipedia hyperlink network

As a free online encyclopedia with broad knowledge coverage, rich semantic information, and frequent updates, Wikipedia offers new opportunities for measuring semantic correlation. In this paper, we present a new method for measuring the semantic correlation between words by mining the rich semantic information contained in Wikipedia. Unlike previous methods, which compute semantic relatedness from either the page network or the category network alone, our method combines the semantic information of both networks, which improves the accuracy of the results. We also analyze and evaluate the algorithm by comparing its results on the same test set against those obtained from well-known knowledge bases (e.g., HowNet) and from traditional Wikipedia-based methods, and the comparison demonstrates its superiority.