Efficient Querying of Distributed Resources in Mediator Systems

This work investigates the integration of heterogeneous resources, such as data and programs, in a fully distributed peer-to-peer mediation architecture. Making such a system succeed at a large scale poses two challenges. First, we need a simple concept for modeling resources. Second, we need efficient operators for distributed query execution that handle costly computations and large data transfers well. To model heterogeneous resources, we use tables with binding patterns. To exploit a resource with restricted binding patterns, we propose an efficient BindJoin operator, optimized to minimize large data transfers and costly computations. Furthermore, the proposed BindJoin operator delivers most of its output in the early stages of execution, an important asset in a system meant for human interaction. Our experimental evaluation validates the proposed BindJoin algorithm on queries involving expensive programs.
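
To make the idea of a table with binding patterns concrete, the following is a minimal sketch of a generic bind join, not the BindJoin algorithm proposed in this work. It assumes a resource that can only be queried once one attribute is bound (supplied as input), modeled here as a plain Python callable. The names `bind_join`, `thumbnail_service`, and the sample rows are hypothetical and chosen for illustration only. The sketch caches results per distinct binding so that a costly call is not repeated, and it uses a generator so that joined rows are emitted as soon as they are available.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def bind_join(
    outer: Iterable[Row],
    bound_attr: str,
    resource: Callable[[object], Iterable[Row]],
) -> Iterator[Row]:
    """Join each outer row with the rows the resource returns for its binding.

    The resource is assumed to be callable only with a value for
    `bound_attr` (a restricted binding pattern). Results are cached per
    distinct binding, so each distinct value triggers one call at most.
    """
    cache: Dict[object, List[Row]] = {}
    for row in outer:
        key = row[bound_attr]
        if key not in cache:
            # Potentially expensive: a remote program call or a large transfer.
            cache[key] = list(resource(key))
        for extra in cache[key]:
            # Yield immediately so early results reach the consumer quickly.
            yield {**row, **extra}


# Hypothetical expensive resource: it records every call it receives.
calls: List[object] = []


def thumbnail_service(image_id: object) -> List[Row]:
    calls.append(image_id)
    return [{"thumb": f"thumb-{image_id}"}]


outer_rows: List[Row] = [
    {"image_id": 1, "site": "A"},
    {"image_id": 2, "site": "B"},
    {"image_id": 1, "site": "C"},  # repeated binding, served from the cache
]

results = list(bind_join(outer_rows, "image_id", thumbnail_service))
```

In this sketch the third outer row reuses the cached result for `image_id` 1, so the resource is invoked only twice for three input rows. A distributed operator would additionally have to decide where the cache lives and how bindings are shipped between peers, which this example does not address.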
