Use of Wikipedia Categories in Entity Ranking

Wikipedia is a useful source of knowledge with many applications in language processing and knowledge representation. The Wikipedia category graph can be compared with the class hierarchy in an ontology: the two share some characteristics but also differ in others. In this paper, we present our approach for answering entity ranking queries from Wikipedia. In particular, we explore how to use Wikipedia categories to improve entity ranking effectiveness. Our experiments show that using the categories of example entities works significantly better than using loosely defined target categories.
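The idea of scoring candidates by category evidence can be illustrated with a minimal sketch. It is not the scoring function used in the paper: the overlap ratio, the function names `category_score` and `rank_by_categories`, and the toy data are all assumptions made for illustration. The reference set could be built either from the categories of example entities or from target categories named in the query.

```python
def category_score(candidate_cats, reference_cats):
    """Fraction of the reference categories that the candidate page also has.

    Hypothetical scoring function (an assumption, not the paper's formula).
    """
    reference = set(reference_cats)
    if not reference:
        return 0.0
    return len(set(candidate_cats) & reference) / len(reference)


def rank_by_categories(candidates, reference_cats):
    """Return candidate titles sorted by descending category overlap.

    `candidates` maps a page title to the set of categories of that page.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    scored = [
        (category_score(cats, reference_cats), title)
        for title, cats in candidates.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]


# Toy data (invented for illustration only).
candidates = {
    "Paris": {"Capitals in Europe", "Cities in France"},
    "Lyon": {"Cities in France"},
    "Seine": {"Rivers of France"},
}
# Categories taken from a hypothetical example entity such as "Berlin".
example_entity_cats = {"Capitals in Europe", "Cities in Germany"}

ranking = rank_by_categories(candidates, example_entity_cats)
```

In this toy setting only "Paris" shares a category with the example entity, so it is ranked first. A practical system would combine such a category score with a full-text retrieval score rather than use it alone.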
