A New Design for a Native XML Storage and Indexing Manager

This paper describes the design and implementation of an XML storage manager for fast, interactive evaluation of XPath expressions. The storage manager has two main parts: the XML data storage structure and the index over that data. The system is designed to minimize the number of page reads needed to retrieve the results of any XPath expression, while avoiding the shortcomings of previous work on storing XML data, where the index must adapt to the most frequent queries. Hence, the main advantage of our index is that it can handle any new XPath expression without adaptation. We demonstrate performance comparable to that of the currently best-known index by presenting path evaluation results for both indexes on documents of different sizes.
