Querying and maintaining a compact XML storage

As XML databases grow, the space used to store the data and its auxiliary data structures becomes a major factor in query and update performance. This paper presents a new storage scheme for XML data that supports all navigational operations in near-constant time. The space requirement of the proposed scheme is within a constant factor of the information-theoretic minimum, and insertions and deletions can also be performed in near-constant time. As a result, the proposed structure has a small memory footprint, which improves cache locality, while still efficiently supporting standard APIs such as DOM and necessary database operations such as queries and updates. Analysis and experiments show that the proposed structure is both space- and time-efficient.
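
The abstract does not describe the encoding itself, so the following is only an illustrative sketch of the general idea behind succinct tree storage, not the scheme proposed in the paper. It assumes a balanced-parentheses encoding (two bits per node, with labels kept in document order), and all function names are hypothetical. Navigation here uses plain linear scans for clarity; a real succinct structure would add small auxiliary indexes so that these operations run in near-constant time.

```python
# Illustrative sketch only: balanced-parentheses encoding of an ordered tree.
# This is an assumed, textbook-style encoding, not the paper's actual scheme.

def encode(tree):
    """Encode a (label, [children]) tree as parenthesis bits plus labels.

    Each node contributes one opening bit (1) and one closing bit (0),
    so the structure costs 2 bits per node. Labels are stored separately
    in document (pre-)order.
    """
    bits, labels = [], []

    def visit(node):
        label, children = node
        bits.append(1)          # opening parenthesis
        labels.append(label)
        for child in children:
            visit(child)
        bits.append(0)          # closing parenthesis

    visit(tree)
    return bits, labels


def find_close(bits, i):
    """Return the position of the closing bit matching the opening bit at i.

    Linear scan for clarity; succinct structures index this operation.
    """
    depth = 0
    for j in range(i, len(bits)):
        depth += 1 if bits[j] else -1
        if depth == 0:
            return j
    raise ValueError("unbalanced parenthesis sequence")


def first_child(bits, i):
    """Position of the first child of the node opening at i, or None."""
    return i + 1 if i + 1 < len(bits) and bits[i + 1] == 1 else None


def next_sibling(bits, i):
    """Position of the next sibling of the node opening at i, or None."""
    j = find_close(bits, i) + 1
    return j if j < len(bits) and bits[j] == 1 else None


def label_of(bits, labels, i):
    """Label of the node opening at i: count the opening bits up to i."""
    return labels[sum(bits[: i + 1]) - 1]


# A small document: <book><title/><author><name/></author><year/></book>
doc = ("book", [("title", []), ("author", [("name", [])]), ("year", [])])
bits, labels = encode(doc)

# Walk the children of the root using only the bit sequence.
children = []
pos = first_child(bits, 0)
while pos is not None:
    children.append(label_of(bits, labels, pos))
    pos = next_sibling(bits, pos)
print(children)
```

The point of the sketch is that parent/child/sibling navigation, the core of a DOM-style API, needs no pointers at all: the tree shape lives entirely in the bit sequence.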
