Paper Information - Large-Scale Named Entity Disambiguation Based on Wikipedia Data

This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and Web search results. It describes in detail the disambiguation paradigm employed and the information extraction process from Wikipedia. Through a process of maximizing the agreement between the contextual information extracted from Wikipedia and the context of a document, as well as the agreement among the category tags associated with the candidate entities, the implemented system shows high disambiguation accuracy on both news stories and Wikipedia articles.
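The agreement-maximization idea in the abstract can be illustrated with a toy sketch. This is not the paper's system: the knowledge base `KB`, the `CANDIDATES` table, the function `disambiguate`, and the simple overlap-count scoring are all hypothetical stand-ins chosen for illustration. Each candidate entity is scored by how many of its context words appear in the document, plus how many of its category tags are shared with candidates of the other surface forms in the same document.

```python
from collections import Counter

# Hypothetical toy knowledge base: entity -> (context words, category tags).
# In the paper these are extracted from Wikipedia; here they are made up.
KB = {
    "Columbia (space shuttle)": ({"nasa", "shuttle", "mission", "orbit"}, {"spacecraft", "nasa"}),
    "Columbia University": ({"university", "new", "york", "students"}, {"universities"}),
    "Discovery (space shuttle)": ({"nasa", "shuttle", "launch", "orbit"}, {"spacecraft", "nasa"}),
    "Discovery Channel": ({"television", "channel", "network"}, {"television"}),
}

# Hypothetical mapping from a surface form to its candidate entities.
CANDIDATES = {
    "Columbia": ["Columbia (space shuttle)", "Columbia University"],
    "Discovery": ["Discovery (space shuttle)", "Discovery Channel"],
}


def disambiguate(doc_words, surface_forms):
    """Pick, for each surface form, the candidate with the highest agreement score."""
    doc = set(doc_words)
    result = {}
    for form in surface_forms:
        # Count category tags contributed by candidates of the other surface forms.
        other_tags = Counter()
        for other in surface_forms:
            if other != form:
                for cand in CANDIDATES[other]:
                    other_tags.update(KB[cand][1])

        def score(entity):
            contexts, tags = KB[entity]
            context_agreement = len(contexts & doc)
            category_agreement = sum(other_tags[t] for t in tags)
            return context_agreement + category_agreement

        result[form] = max(CANDIDATES[form], key=score)
    return result


words = "nasa delayed the shuttle launch".split()
print(disambiguate(words, ["Columbia", "Discovery"]))
```

In this toy document, both surface forms resolve to the space shuttle entries because their context words overlap with the text and their category tags agree with each other. The equal weighting of the two agreement terms is an arbitrary choice made for the sketch.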

Silviu Cucerzan

[1] Gerald Salton, et al. Automatic text processing, 1988.

[2] David Yarowsky, et al. One Sense Per Discourse, 1992, HLT.

[3] Marti A. Hearst. Automatic Acquisition of Hyponyms from Large Text Corpora, 1992, COLING.

[4] Allison Woodruff, et al. GIPSY: automated geographic indexing of text documents, 1994.

[5] Ralph Grishman, et al. Message Understanding Conference-6: A Brief History, 1996, COLING.

[6] Nina Wacholder, et al. Disambiguation of Proper Names in Text, 1997, ANLP.

[7] Zunaid Kazi, et al. Is Hillary Rodham Clinton the President? Disambiguating Names across Documents, 1999, COREF@ACL.

[8] Yasusi Kanada. A method of geographical name extraction from Japanese text for thematic geographical search, 1999, CIKM '99.

[9] Brian Roark, et al. Noun-phrase co-occurrence statistics for semi-automatic semantic lexicon construction, 2000, COLING.

[10] Adam Kilgarriff, et al. Framework and Results for English SENSEVAL, 2000, Comput. Humanit.

[11] Gregory R. Crane, et al. Disambiguating Geographic Names in a Historical Digital Library, 2001, ECDL.