Context Based Wikipedia Linking

Wikipedia pages can be linked automatically either with a content-based approach that exploits word similarities or with a structure-based approach that exploits characteristics of the link graph. Our approach follows a content-based strategy: it detects Wikipedia titles as link candidates and selects the most relevant ones as links. The relevance calculation is based on the context, i.e. the text surrounding a link candidate. Our goal was to evaluate the influence of the link context on selecting relevant links and on determining a link's best entry point. Results show that a whole Wikipedia page provides the best context for resolving links and that straightforward inverse document frequency based scoring of anchor texts achieves around 4% less Mean Average Precision on the provided data set.
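
To make the inverse document frequency baseline mentioned above concrete, the following is a minimal sketch of how title detection and anchor-text scoring might look. It is an illustration under stated assumptions, not the implementation evaluated here: the function names (idf, find_candidates, rank_by_idf), the smoothed IDF formula, the substring-based document frequency, and the toy data are all hypothetical, and the context-based relevance calculation is not modelled.

```python
import math
import re


def idf(term, documents):
    """Smoothed inverse document frequency of a term over a list of page texts.

    Assumption: document frequency is counted by case-insensitive substring
    match, which is a simplification chosen for brevity.
    """
    df = sum(1 for doc in documents if term.lower() in doc.lower())
    return math.log((1 + len(documents)) / (1 + df))


def find_candidates(text, titles):
    """Return (title, character offset) pairs for every known title found in text."""
    found = []
    for title in titles:
        pattern = r"\b" + re.escape(title) + r"\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            found.append((title, match.start()))
    return sorted(found, key=lambda pair: pair[1])


def rank_by_idf(text, titles, documents):
    """Rank the distinct candidate titles in text by descending anchor-text IDF."""
    distinct = {title for title, _ in find_candidates(text, titles)}
    return sorted(distinct, key=lambda t: (-idf(t, documents), t))


# Hypothetical toy collection standing in for Wikipedia titles and page texts.
titles = ["information retrieval", "page", "link graph"]
documents = [
    "A page about information retrieval.",
    "A page about the link graph of pages.",
    "Another page about cooking.",
    "A page on link graph analysis.",
]
text = "The link graph of a page is used in information retrieval."

print(find_candidates(text, titles))
print(rank_by_idf(text, titles, documents))
```

In this sketch a title that occurs in few pages receives a high score and is preferred as a link, while a title that occurs in nearly every page scores close to zero. A context-based variant would replace the idf call with a similarity between the candidate's surrounding text and the target page.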