PartiX: processing XQuery queries over fragmented XML repositories

The data volume of XML repositories and the response time of query processing have become critical issues for many applications, especially those on the Web. One way to improve query processing performance is to reduce the size of XML databases through fragmentation techniques. However, traditional fragmentation definitions do not apply directly to collections of XML documents. This work formalizes the definition of fragmentation for collections of XML documents and proposes an architecture for XQuery processing on top of fragmented XML data. The architecture was implemented in a prototype system named PartiX, which exploits intra-query parallelism on top of XQuery-enabled sequential DBMS modules. We analyzed several experimental settings, and our results show a performance improvement with a scale-up factor of up to 72 compared with centralized databases.
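As a rough illustration of the idea summarized above, the following Python sketch splits a small collection of XML documents into disjoint fragments and evaluates the same sub-query on each fragment in parallel before merging the partial results. It is only a toy under stated assumptions: the document shape, the fragmentation predicates, and the helper names (`fragment`, `run_subquery`, `run_parallel`) are all hypothetical, a plain Python filter stands in for an XQuery expression, and threads stand in for separate DBMS nodes. It does not reproduce the formal fragmentation definitions or the PartiX architecture.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy collection: each string is one XML document.
DOCS = [
    "<item><section>CD</section><name>A</name></item>",
    "<item><section>DVD</section><name>B</name></item>",
    "<item><section>CD</section><name>C</name></item>",
    "<item><section>Book</section><name>D</name></item>",
]

def fragment(docs, predicates):
    """Assign each whole document to the first fragment whose predicate holds."""
    fragments = [[] for _ in predicates]
    for text in docs:
        doc = ET.fromstring(text)
        for i, pred in enumerate(predicates):
            if pred(doc):
                fragments[i].append(doc)
                break
    return fragments

def run_subquery(frag):
    """Stand-in for a sub-query sent to one node: names of items in section CD."""
    return [d.findtext("name") for d in frag if d.findtext("section") == "CD"]

def run_parallel(fragments):
    """Evaluate the sub-query on every fragment concurrently and union the results."""
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = list(pool.map(run_subquery, fragments))
    return [name for part in partials for name in part]

# Assumed fragmentation predicates: split the collection by item name.
PREDICATES = [
    lambda d: d.findtext("name") < "C",
    lambda d: d.findtext("name") >= "C",
]

FRAGMENTS = fragment(DOCS, PREDICATES)
RESULT = run_parallel(FRAGMENTS)
```

Because the two predicates are disjoint and together cover every document, the merged result matches what the same filter would return over the unfragmented collection, which is the correctness property a fragmentation design has to preserve.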
