Fast Multi-update Operations on Compressed XML Data

Grammar-based XML compression reduces the volume of large XML data collections, but updating the compressed data quickly can become a bottleneck. It has remained an open question how, given an XPath query and an update operation, to efficiently compute the update positions within a grammar that represents a compressed XML file. In this paper, we propose an automaton-based solution that computes these positions, combines them in a so-called Update DAG, supports parallel updates, and uses dynamic programming to avoid an implicit decompression of the grammar. As a result, our solution updates compressed XML even faster than MXQuery and Qizx update uncompressed XML.
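
To illustrate why dynamic programming over grammar rules can avoid decompression, here is a minimal sketch. It is not the paper's algorithm: it assumes a simplified grammar without parameters, in which every nonterminal stands for one element label plus a list of child nonterminals, and it only counts the nodes that a query such as `//author` would select. The names `GRAMMAR`, `count_matches`, and `expanded_size` are hypothetical and chosen for this example only.

```python
from functools import lru_cache

# Hypothetical toy grammar (an assumption, not the paper's format):
# nonterminal -> (element label, list of child nonterminals).
# "B" is referenced three times but stored only once.
GRAMMAR = {
    "S": ("library", ["B", "B", "B"]),
    "B": ("book", ["T", "A", "A"]),
    "T": ("title", []),
    "A": ("author", []),
}


def count_matches(grammar, start, label):
    """Count nodes named `label` in the tree that `start` expands to."""

    @lru_cache(maxsize=None)
    def count(nonterminal):
        node_label, children = grammar[nonterminal]
        own = 1 if node_label == label else 0
        # Each rule is evaluated once; repeated references hit the cache.
        return own + sum(count(child) for child in children)

    return count(start)


def expanded_size(grammar, start):
    """Number of nodes in the fully expanded tree, computed rule by rule."""

    @lru_cache(maxsize=None)
    def size(nonterminal):
        _, children = grammar[nonterminal]
        return 1 + sum(size(child) for child in children)

    return size(start)


if __name__ == "__main__":
    print(count_matches(GRAMMAR, "S", "author"))
    print(expanded_size(GRAMMAR, "S"))
```

Because results are memoized per nonterminal, the work is proportional to the size of the grammar rather than to the size of the expanded document. Locating concrete update positions and applying several updates in parallel, as the abstract describes, requires more machinery than this count.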
