Lazy XML updates: laziness as a virtue of update and structural join efficiency

XML documents are normally stored as plain text files, so the most natural and convenient way to update them is simply to edit the text. Efficient query evaluation algorithms, however, require XML documents to be indexed: every element is given a unique identifier based on its location in the document or its preorder-traversal order, and this identifier is later used as (part of) the key in the index. When the original documents are updated, the orders of a possibly large number of elements must therefore be reassigned. Immutable dynamic labeling schemes have been proposed to solve this problem, but they require very long labels and may decrease query performance. In real-world scenarios, many relatively small ad hoc XML segments are inserted into or deleted from an existing XML database. Starting from this observation, we propose a new lazy approach to handling XML updates that also improves query performance. The lazy approach (i) completely avoids reassigning existing element orders after updates and (ii) improves query processing by taking advantage of segments. Experimental results show that our approach handles updates much more efficiently than immutable labeling and, at the same time, improves the performance of recently defined structural join algorithms.
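
The relabeling problem described above can be illustrated with a minimal sketch. It assumes a simple (start, end) interval labeling derived from a preorder walk, which is one common order-based scheme and not necessarily the one used in this paper; it does not implement the lazy approach itself. The names `label` and `is_ancestor` are hypothetical helpers introduced only for this illustration.

```python
import xml.etree.ElementTree as ET


def label(root):
    """Assign each element a (start, end) pair from a depth-first walk.

    Assumed illustrative scheme: start is taken when the element is
    entered, end when it is left, so a descendant's interval nests
    strictly inside its ancestor's interval.
    """
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        start = counter
        for child in node:
            visit(child)
        counter += 1
        labels[node] = (start, counter)

    visit(root)
    return labels


def is_ancestor(anc, desc):
    """Containment test of the kind a structural join evaluates on labels."""
    return anc[0] < desc[0] and desc[1] < anc[1]


doc = ET.fromstring("<a><b/><c><d/></c></a>")
before = label(doc)

# Insert a small XML segment as the first child of the root.
doc.insert(0, ET.fromstring("<x><y/></x>"))
after = label(doc)

# Every pre-existing element whose label had to be reassigned.
changed = sorted(node.tag for node in before if before[node] != after[node])
print(changed)
```

In this sketch a two-element insertion near the start of the document changes the label of every element that already existed, which is the cost that a scheme avoiding reassignment of existing element orders is meant to eliminate.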
