Query Processing and Optimization Techniques over Streamed Fragmented XML

With the extensive use of XML in Web applications, efficient query processing over streaming XML has become a core challenge, because the data must be processed in a single pass with limited resources. Taking advantage of the Hole-Filler model for XML fragments, this paper proposes a hybrid structure (FQ-Index) for both the queries and the fragments, together with an XML fragment processing algorithm that evaluates forward XPath queries over streamed XML fragments. Two optimization rules, dependence pruning and prefix pruning, are also developed. Dependence pruning removes the dependent operations caused by fragmentation and transforms queries over XML tags into queries over XML fragments, while prefix pruning removes the “redundant” prefix of a path according to the tag structure. The effectiveness of these techniques is demonstrated with a detailed set of experiments.
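To make the Hole-Filler model concrete, the following minimal Python sketch shows how a document split into fragments can be put back together: each fragment (a filler) carries an id, and a hole inside a fragment names the id of the filler that belongs at that position. The element names `filler` and `hole`, the `id` attribute, the convention that the root filler has id 0, and the function `assemble` are assumptions made for this illustration only. The sketch does not reproduce the FQ-Index or the paper's processing algorithm, which evaluates queries on the stream without first rebuilding the document.

```python
import xml.etree.ElementTree as ET


def assemble(fragments, root_id=0):
    """Rebuild a document by replacing every hole with the filler it names.

    Illustrative only: assumes all fragments have already arrived and that
    each <filler> wraps exactly one payload element.
    """
    fillers = {}
    for text in fragments:  # fragments may arrive in any order
        elem = ET.fromstring(text)
        fillers[int(elem.get("id"))] = elem[0]

    def fill(node):
        # Iterate over a snapshot so in-place replacement is safe.
        for i, child in enumerate(list(node)):
            if child.tag == "hole":
                replacement = fillers[int(child.get("id"))]
                node[i] = replacement
                fill(replacement)  # a filler may itself contain holes
            else:
                fill(child)
        return node

    return fill(fillers[root_id])


fragments = [
    '<filler id="2"><item><name>pen</name></item></filler>',
    '<filler id="0"><catalog><hole id="1"/><hole id="2"/></catalog></filler>',
    '<filler id="1"><item><name>ink</name></item></filler>',
]
document = assemble(fragments)
print(ET.tostring(document, encoding="unicode"))
```

Because the fillers here arrive out of order, a tag-level query such as `/catalog/item/name` cannot be answered from one fragment alone; that dependence between fragments is what the abstract's dependence pruning rule is aimed at.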
