Efficient XQuery Support for Stand-Off Annotation

XML annotations are a widely occurring phenomenon in many application fields, and XML databases should be used to store and query such data. To provide intuitive and fast querying of annotations, we make a case for extending XPath with four new axis steps that correspond to so-called StandOff joins, introduced here. The new steps can be implemented efficiently using a region index and fast loop-lifted StandOff MergeJoin algorithms. These techniques were added to the open-source XML DBMS MonetDB/XQuery, and our evaluation shows that it thereby becomes capable of interactively querying annotation databases larger than a gigabyte.
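The abstract does not spell out the semantics of the four joins, so the following is only an illustrative sketch under stated assumptions: annotations are modelled as integer regions `(start, end)`, and the four steps are assumed to be the two region predicates containment and overlap, each in a selecting and a rejecting variant. The names `Region`, `contains`, `overlaps`, and `standoff_join` are hypothetical, and the nested loop is a naive stand-in, not the loop-lifted StandOff MergeJoin or the region index mentioned above.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: an annotation covers the closed interval [start, end].
Region = Tuple[int, int]


def contains(outer: Region, inner: Region) -> bool:
    """True if `inner` lies entirely within `outer` (assumed containment predicate)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions share at least one position (assumed overlap predicate)."""
    return a[0] <= b[1] and b[0] <= a[1]


def standoff_join(
    context: List[Region],
    candidates: List[Region],
    predicate: Callable[[Region, Region], bool],
    select: bool = True,
) -> List[Region]:
    """Naive nested-loop join over regions.

    With select=True, keep candidates that match some context region;
    with select=False, keep candidates that match none.
    """
    result = []
    for cand in candidates:
        matched = any(predicate(ctx, cand) for ctx in context)
        if matched == select:
            result.append(cand)
    return result


context = [(0, 10)]
candidates = [(2, 5), (8, 12), (20, 30)]

contained = standoff_join(context, candidates, contains, select=True)
overlapping = standoff_join(context, candidates, overlaps, select=True)
not_contained = standoff_join(context, candidates, contains, select=False)
not_overlapping = standoff_join(context, candidates, overlaps, select=False)
```

Under these assumptions, the two predicates combined with the select/reject flag give four join variants over the same inputs. A merge-based implementation would instead sort both inputs by region start and scan them once.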
