Processing Strategy for Global XQuery Queries Based on XQuery Join Cost

XML is a standard for formatting and exchanging data over the Internet, and XQuery is a standard query language for searching and integrating XML data. XQuery is therefore a natural choice for interoperability over the Internet. Global XQuery queries search and integrate heterogeneous data distributed across local systems. To process global XQuery queries efficiently, the processing strategy is important, because an improper strategy could produce an enormous number of intermediate results or execute redundant expressions. Distributed relational databases offer some techniques for processing global SQL queries. Unfortunately, the structure of the data handled by XQuery is quite different from that handled by SQL: XQuery deals with semi-structured data, i.e., tree-structured data, while SQL deals with well-structured data, i.e., table-shaped data. These structural differences make it difficult to apply the techniques for global SQL queries to global XQuery queries. In particular, this paper considers the join cost in devising a query processing strategy. We define some problems in estimating the join cost in XQuery queries and propose the ECNJ algorithm for solving them. This paper also proposes a query processing strategy and evaluates it by implementing a prototype system.
