Paper information - Focused Access to Wikipedia

Wikipedia is a “free” online encyclopedia. It contains millions of entries in many languages and is growing at a fast pace. Due to its volume, search engines play an important role in giving access to the information in Wikipedia. The “free” availability of the collection makes it an attractive corpus for information retrieval experiments. In this paper we describe the evaluation of a search engine that provides focused search access to Wikipedia, i.e., a search engine which gives direct access to individual sections of Wikipedia pages. The main contributions of this paper are twofold. First, we introduce Wikipedia as a test corpus for information retrieval experiments in general and for semi-structured retrieval in particular. Second, we demonstrate that focused XML retrieval methods can be applied to a wider range of problems than searching scientific journals in XML format, including accessing reference works.

Maarten de Rijke | Jaap Kamps
