XPath Extension for Querying Concurrent XML Markup

XPath is a language for addressing parts of an XML document. It is used in many XML query languages, and it can also be used on its own to query XML documents. While XPath is generally efficient for querying individual XML documents, it lacks features for querying collections of documents or for joining parts of the same document. Although the amount of complex document-centric XML data is continually increasing, querying such documents has drawn surprisingly little attention. We propose an extension of the XPath axes for querying collections of document-centric XML documents that share the same content (called concurrent XML). The algorithms we propose for evaluating the extended axes run in linear time in terms of combined complexity (the number of documents and the total size of the documents).
