Effective Use of Semantic Structure in XML Retrieval

The objective of XML retrieval is to return relevant XML document fragments that answer a given user information need by exploiting the document structure. This article focuses on automatically deriving semantic XML structure and using it to enhance the retrieval performance of XML retrieval systems. Using a naive approach to named entity detection, we discuss how the structure of an XML document can be enriched, illustrated on the Reuters-21578 news collection. In a retrieval performance experiment, we then study the effect of the additional semantic structure on the retrieval performance of our XSee search engine for XML documents. The experiment provides some initial evidence that an XML retrieval system benefits significantly from having meaningful XML structure.
