Sound ranking algorithms for XML search

Ranking algorithms for XML should reflect the combined content and structure constraints of a query while producing equal rankings for queries that are semantically equal. We call an algorithm that produces different rankings for semantically equal queries not sound; such algorithms are easily detected by tests on large databases. We report how different approaches to ranking content-and-structure queries behave on pairs of queries for which the query semantics lead us to expect equal rankings. We show that most of these approaches are not sound. Of the remaining approaches, only three adhere to the W3C XQuery Full-Text standard.
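The soundness test described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: queries are modelled as plain tuples of terms rather than real content-and-structure queries, the two toy rankers and the names `same_ranking` and `find_unsound_pairs` are hypothetical, and the only equivalence exercised is that reordering the terms of a query should not change the ranking.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types for this sketch: a query is a tuple of terms,
# a ranking is a list of (document id, score) pairs, best first.
Query = Tuple[str, ...]
Ranking = List[Tuple[str, float]]
Ranker = Callable[[Query, Dict[str, str]], Ranking]


def _rank(scores: Dict[str, float]) -> Ranking:
    # Sort by descending score; break ties by document id so the
    # output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def sound_ranker(query: Query, collection: Dict[str, str]) -> Ranking:
    # Toy ranker: sum of term frequencies, independent of term order.
    return _rank({
        doc_id: float(sum(text.split().count(term) for term in query))
        for doc_id, text in collection.items()
    })


def order_sensitive_ranker(query: Query, collection: Dict[str, str]) -> Ranking:
    # Toy ranker that weights earlier query terms more heavily, so
    # reordering the terms can change the ranking (not sound).
    return _rank({
        doc_id: sum(text.split().count(term) / (position + 1)
                    for position, term in enumerate(query))
        for doc_id, text in collection.items()
    })


def same_ranking(a: Ranking, b: Ranking, tol: float = 1e-9) -> bool:
    # Two rankings are equal if they list the same documents in the
    # same order with (numerically) the same scores.
    return len(a) == len(b) and all(
        doc_a == doc_b and abs(score_a - score_b) <= tol
        for (doc_a, score_a), (doc_b, score_b) in zip(a, b)
    )


def find_unsound_pairs(ranker: Ranker,
                       query_pairs: Sequence[Tuple[Query, Query]],
                       collection: Dict[str, str]) -> List[Tuple[Query, Query]]:
    # Return every pair of semantically equal queries on which the
    # ranker produces different rankings.
    return [
        (q1, q2) for q1, q2 in query_pairs
        if not same_ranking(ranker(q1, collection), ranker(q2, collection))
    ]


collection = {"d1": "xml ranking xml", "d2": "ranking ranking xml"}
pairs = [(("xml", "ranking"), ("ranking", "xml"))]

print(find_unsound_pairs(sound_ranker, pairs, collection))
print(find_unsound_pairs(order_sensitive_ranker, pairs, collection))
```

A ranker passes this check only for the query pairs and collection it is given, so an empty result is evidence of soundness on those pairs, not a proof of it.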
