- based Named Entity Disambiguation to Arbitrary Web Text

This paper investigates the “named-entity disambiguation” task on the Web: identifying the referent of a string found on an arbitrary Web page. The GROUNDER system, introduced in this paper, addresses two challenges not considered by previous work: how to utilize a priori information (e.g., that Bill Clinton is more prominent on the Web than Clinton County) to improve disambiguation, and how to combine this prior information with contextual evidence. GROUNDER addresses both challenges by leveraging the user-contributed knowledge in Wikipedia and by providing a novel formulation of the task. On a sample of strings drawn from the Web, GROUNDER achieves a precision of 1.0 at a recall of 0.34, and a precision of 0.90 at a recall of 0.60.
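
To make the idea of combining a prior with contextual evidence concrete, the following is a minimal sketch and not GROUNDER's actual formulation, which the abstract does not specify. It assumes that each candidate entity has a prominence prior and a bag of descriptive words, scores a candidate as the prior times a smoothed cosine similarity with the surrounding text, and abstains when the normalized score falls below a threshold. The function names, the smoothing constant, the prior values, and the word lists are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def disambiguate(context_words, candidates, threshold=0.0, smoothing=1e-3):
    """Pick the most likely referent for an ambiguous string.

    candidates maps an entity name to (prior, description_words).
    Returns (best_entity_or_None, normalized_scores).
    The scoring rule (prior * smoothed similarity) is an assumption,
    not the method described in the paper.
    """
    ctx = Counter(w.lower() for w in context_words)
    scores = {}
    for name, (prior, description) in candidates.items():
        sim = cosine(ctx, Counter(w.lower() for w in description))
        # Smoothing keeps the prior relevant when the context has no overlap.
        scores[name] = prior * (sim + smoothing)
    total = sum(scores.values())
    if total == 0:
        return None, {}
    normalized = {name: s / total for name, s in scores.items()}
    best = max(normalized, key=normalized.get)
    # Abstaining below the threshold trades recall for precision.
    return (best if normalized[best] >= threshold else None), normalized


# Hypothetical priors and descriptions, invented for this example.
CANDIDATES = {
    "Bill Clinton": (0.9, "president united states arkansas governor".split()),
    "Clinton County": (0.1, "county ohio seat wilmington population".split()),
}

if __name__ == "__main__":
    print(disambiguate("he said yesterday".split(), CANDIDATES)[0])
    print(disambiguate("the county seat is in ohio".split(), CANDIDATES)[0])
```

With uninformative context the prior decides in favor of the more prominent entity, while context that overlaps with one candidate's description can override the prior. Raising `threshold` makes the sketch abstain on less certain cases, which is one simple way to obtain higher precision at lower recall.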