Paper information - GPX - Gardens Point XML Information Retrieval at INEX 2004

Traditional information retrieval (IR) systems respond to user queries with ranked lists of relevant documents. The separation of content and structure in XML documents allows individual XML elements to be selected in isolation. Thus, users expect XML-IR systems to return highly relevant results that are more precise than entire documents. In this paper we describe the implementation of a search engine for XML document collections. The system is keyword-based and is built upon an XML inverted file system. We describe the approach that was adopted to meet the requirements of Content Only (CO) and Vague Content and Structure (VCAS) queries in INEX 2004.

Shlomo Geva