CIS-X: A compacted indexing scheme for efficient query evaluation of XML documents

Several indexing and query evaluation methods have been proposed to accelerate query processing over XML documents. The structural summary approach reduces the portion of the XML document that must be scanned during query processing. However, most methods based on this approach cannot support complex queries efficiently, suffer from long index construction times and large index sizes, or both. Many query processing methods focus on twig pattern matching and have developed various structures to store intermediate results. The problems with these query processing methods include huge intermediate results, an expensive merging phase, complicated data structures, and the need to scan all potential nodes. This paper proposes a compacted indexing scheme for XML documents called CIS-X, which combines the advantages of the structural summary and query processing methods. The experimental results show that CIS-X can resolve most of the above problems and usually outperforms existing techniques.
