On wrapping query languages and efficient XML integration

Modern applications (Web portals, digital libraries, etc.) require integrated access to various information sources (from traditional DBMSs to semistructured Web repositories), fast deployment, and low maintenance cost in a rapidly evolving environment. Because of XML's flexibility, there is increasing interest in using it as a middleware model for such applications. XML enables fast wrapping and declarative integration. However, query processing in XML-based integration systems is still penalized by the lack of an algebra with adequate optimization properties and by the difficulty of understanding source query capabilities. In this paper, we propose an algebraic approach to support efficient XML query evaluation. We define a general-purpose algebra suitable for semistructured or XML query languages. We show how this algebra can be used, with appropriate type information, to also wrap more structured query languages such as OQL or SQL. Finally, we develop new optimization techniques for XML-based integration systems.
