Modelling Vague Content and Structure Querying in XML Retrieval with a Probabilistic Object-Relational Framework

Many XML retrieval applications require relevance-oriented ranking of retrieved elements in order to capture the vagueness inherent in the information retrieval process. This ranking should support vagueness not only at the content level but also at the structural level. In this paper, we use a probabilistic object-relational framework to model representation and retrieval strategies that account for vagueness at both the content and structural levels. Our approach combines established database technology with sound probability theory, thus allowing fast and flexible prototyping of various representation and retrieval strategies.
