Path Materialization Revisited: An Efficient Storage Model for XML Data

XML is emerging as a major new standard for representing data on the World Wide Web. Several XML storage models have been proposed for storing XML data in different database management systems. The distinguishing feature of model-mapping-based approaches is that they require no DTD information to store XML data. In this paper, we present a new model-mapping-based storage model called XParent. Existing work on model-mapping-based approaches emphasizes converting XML documents to and from database schemas and translating XML queries into SQL queries. In contrast, we focus on the effectiveness of storage models in terms of query processing. We study the key issues that affect query performance, namely storage schema design (storing XML data across multiple tables) and path materialization (storing path information in databases). We show that storage models that are similar in design can nonetheless differ significantly in query performance. We conduct a performance study using three data sets and query sets, and present the experimental results.
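As a rough illustration of what path materialization means in a model-mapping setting, the sketch below stores each distinct root-to-node label path once in one table and lets every element row refer to it, so that a simple path query becomes a lookup on the path string plus a join. The table names (`label_path`, `node`), their columns, and the helper functions are assumptions made for this example; they are not the XParent schema, which the abstract does not spell out.

```python
import sqlite3
import xml.etree.ElementTree as ET


def materialize_paths(xml_text):
    """Load an XML document into two illustrative tables.

    Hypothetical schema for demonstration only (not the XParent schema):
      label_path(path_id, path)  -- one row per distinct label path
      node(node_id, parent_id, path_id, value)  -- one row per element
    No DTD is consulted; the layout is the same for any document.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE label_path (
            path_id INTEGER PRIMARY KEY,
            path    TEXT UNIQUE
        );
        CREATE TABLE node (
            node_id   INTEGER PRIMARY KEY,
            parent_id INTEGER,
            path_id   INTEGER REFERENCES label_path(path_id),
            value     TEXT
        );
    """)
    path_ids = {}  # label path string -> path_id

    def visit(elem, parent_id, prefix):
        path = prefix + "/" + elem.tag
        if path not in path_ids:
            cur = conn.execute(
                "INSERT INTO label_path (path) VALUES (?)", (path,))
            path_ids[path] = cur.lastrowid
        # Store trimmed element text, or NULL when there is none.
        text = (elem.text or "").strip() or None
        cur = conn.execute(
            "INSERT INTO node (parent_id, path_id, value) VALUES (?, ?, ?)",
            (parent_id, path_ids[path], text))
        node_id = cur.lastrowid
        for child in elem:
            visit(child, node_id, path)

    visit(ET.fromstring(xml_text), None, "")
    return conn


def values_at(conn, path):
    """Return the text values of all elements reached by a simple label path."""
    rows = conn.execute(
        "SELECT n.value FROM node n "
        "JOIN label_path p ON n.path_id = p.path_id "
        "WHERE p.path = ? ORDER BY n.node_id", (path,))
    return [row[0] for row in rows]
```

With this layout, a query such as `values_at(conn, "/bib/book/title")` needs no chain of parent-child joins, because the path was stored at load time. Whether paths are kept in a separate table or repeated with each element is the kind of schema choice the abstract says can affect query performance.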
