DDE: from Dewey to a fully dynamic XML labeling scheme

Labeling schemes lie at the core of query processing in many XML database management systems. Designing labeling schemes for dynamic XML documents is an important problem that has received considerable research attention. Existing dynamic labeling schemes, however, often sacrifice query performance and incur additional labeling cost to support arbitrary updates, even when the documents are seldom updated. Since the line between static and dynamic XML documents is often blurred in practice, we believe it is important to design a labeling scheme that is compact and efficient regardless of how frequently the documents are updated. In this paper, we propose a novel labeling scheme called DDE (for Dynamic DEwey), which is tailored for both static and dynamic XML documents. For static documents, DDE labels are the same as Dewey labels, which yields compact size and high query performance. When updates take place, DDE completely avoids re-labeling, and its label quality is more resilient to the number and order of insertions than that of existing approaches. In addition, we introduce Compact DDE (CDDE), which is designed to optimize the performance of DDE for insertions. Both DDE and CDDE can be incorporated into existing systems and applications that are based on the Dewey labeling scheme with minimal effort. Experimental results demonstrate the benefits of our proposed labeling schemes over previous approaches.
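
As a point of reference for the static case, the following is a minimal sketch of plain Dewey labeling, which the abstract states DDE labels coincide with when no updates occur. It does not show how DDE or CDDE handle insertions, since that is not described here. The tree representation (nested dicts with unique `name` keys) and the function names `assign_dewey`, `is_ancestor`, and `precedes` are assumptions made for illustration only.

```python
from typing import Dict, Tuple

Label = Tuple[int, ...]


def assign_dewey(tree: dict, label: Label = (1,)) -> Dict[str, Label]:
    """Label each node with its parent's label plus its 1-based sibling position.

    Assumes every node has a unique "name" (an illustrative simplification).
    """
    labels = {tree["name"]: label}
    for position, child in enumerate(tree.get("children", []), start=1):
        labels.update(assign_dewey(child, label + (position,)))
    return labels


def is_ancestor(a: Label, b: Label) -> bool:
    """With Dewey labels, a is an ancestor of b iff a is a proper prefix of b."""
    return len(a) < len(b) and b[: len(a)] == a


def precedes(a: Label, b: Label) -> bool:
    """Document order: lexicographic comparison of the label components."""
    return a < b


# Hypothetical example document: root -> (a -> (c, d), b)
doc = {
    "name": "root",
    "children": [
        {"name": "a", "children": [{"name": "c"}, {"name": "d"}]},
        {"name": "b"},
    ],
}
labels = assign_dewey(doc)
```

In this sketch, node `c` receives the label `(1, 1, 1)`, so the ancestor test against `a` with label `(1, 1)` reduces to a prefix comparison, and document order reduces to tuple comparison.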
