Entity Retrieval

Motivated by recent interest in retrieving entities rather than just documents, we introduce two entity retrieval tasks: list completion and entity ranking. For each task, we propose and evaluate several algorithms. A core challenge is the very limited amount of information that serves as input. To address it, we explore different representations of list descriptions and/or example entities, in which an entity is represented not just by its own textual description but also by the descriptions of related entities. For evaluation, we use the lists and categories available in Wikipedia. Experimental results show that cluster-based contexts improve retrieval results for both tasks.
