Evaluating semantic relatedness through categorical and contextual information for entity disambiguation

The number of entities in large-scale knowledge bases has been growing in recent years. A key issue in entity linking against a knowledge base such as Wikipedia is entity disambiguation. The objective of our proposed system is to disambiguate entities in documents and link entity mentions to their corresponding Wikipedia articles. To this end, our system ranks the set of candidate entities by relatedness, using semantic features derived from Wikipedia category hierarchies and articles. In addition, to reflect contextual information from Wikipedia, we use word embeddings to refine the ranking of candidate entities. Our experimental results show that these features correlate well with human rankings in candidate relatedness ranking, and that the combination of features achieves high disambiguation accuracy on news articles.
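
The ranking idea can be illustrated with a minimal sketch. It is not the system's actual implementation: the function names, the Jaccard overlap used as a stand-in for the category-hierarchy features, the linear weighting `alpha`, and the toy vectors are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def category_overlap(cats_a, cats_b):
    """Jaccard overlap of two category sets.

    Assumption: a simple stand-in for features derived from the
    Wikipedia category hierarchy, not the paper's actual measure.
    """
    a, b = set(cats_a), set(cats_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(context_vec, context_cats, candidates, alpha=0.5):
    """Rank candidate entities by a weighted mix of both signals.

    candidates maps an entity title to (embedding, categories).
    alpha is a hypothetical mixing weight between the categorical
    and the embedding-based (contextual) score.
    """
    scored = []
    for title, (vec, cats) in candidates.items():
        score = (alpha * category_overlap(context_cats, cats)
                 + (1 - alpha) * cosine(context_vec, vec))
        scored.append((score, title))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]


# Toy data (invented): two candidates for the mention "Apple".
candidates = {
    "Apple Inc.": ([1.0, 0.0], {"Technology companies"}),
    "Apple (fruit)": ([0.0, 1.0], {"Fruits"}),
}
ranking = rank_candidates(
    context_vec=[0.9, 0.1],
    context_cats={"Technology companies", "Electronics"},
    candidates=candidates,
)
print(ranking)
```

With this toy context, which is closer to the technology sense in both its categories and its vector, `Apple Inc.` is ranked ahead of `Apple (fruit)`.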