QED: a novel quaternary encoding to completely avoid re-labeling in XML updates

A labeling scheme is a method of assigning labels to the nodes of an XML tree. Using the labels alone, both ordered and unordered queries can be processed without accessing the original XML file. Another important property of a labeling scheme is its update cost, that is, the cost of re-labeling when a node is inserted into or deleted from the XML tree. Existing labeling schemes have a high update cost, so in this paper we propose QED, a novel quaternary encoding approach for labeling schemes. With this encoding, no existing node needs to be re-labeled when an update is performed. Extensive experiments on XML datasets show that QED performs much better than existing labeling schemes on label updates, whether measured by the number of nodes re-labeled or by the time spent re-labeling.
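To illustrate how an update can avoid re-labeling altogether, the following minimal Python sketch generates a new code that sorts strictly between two existing neighbours without changing either of them. It is not the QED algorithm itself, which the abstract does not spell out. The function name `code_between`, the restriction to the digits 1, 2 and 3, and the rule that every code ends in 2 or 3 are all assumptions made for this illustration.

```python
from typing import Optional


def code_between(left: Optional[str], right: Optional[str]) -> str:
    """Return a code that sorts strictly between `left` and `right`.

    Hypothetical helper, not the paper's algorithm. Codes are assumed to be
    strings over the digits 1, 2, 3 that end in 2 or 3 and are compared
    lexicographically. `None` means there is no neighbour on that side.
    """
    left = left or ""
    if not right:
        # No right neighbour: extending `left` always yields a larger code.
        return left + "2"
    assert left < right, "neighbours must already be in order"
    if right.startswith(left):
        # `left` is a prefix of `right` (or absent): step just below `right`
        # by replacing its last digit with "12".
        return right[:-1] + "12"
    # The codes differ at some earlier digit, so extending `left`
    # stays below `right`.
    return left + "2"


# Repeatedly insert at the same position; earlier labels are never modified.
labels = [code_between(None, None)]            # first node
labels.append(code_between(labels[-1], None))  # append after it
for _ in range(5):
    labels.insert(1, code_between(labels[0], labels[1]))
```

After these insertions `labels` is still in strictly increasing lexicographic order, and every label assigned earlier keeps its original value. The price in this simplified sketch is that codes can grow by a digit or two per insertion at a hot spot; keeping codes compact is a separate design concern that the sketch does not address.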
