Eddies: continuously adaptive query processing

In large federated and shared-nothing databases, resources can exhibit widely fluctuating characteristics. Assumptions made at the time a query is submitted will rarely hold throughout the duration of query processing. As a result, traditional static query optimization and execution techniques are ineffective in these environments. In this paper we introduce a query processing mechanism called an eddy, which continuously reorders operators in a query plan as it runs. We characterize the moments of symmetry during which pipelined joins can be easily reordered, and the synchronization barriers that require inputs from different sources to be coordinated. By combining eddies with appropriate join algorithms, we merge the optimization and execution phases of query processing, allowing each tuple to have a flexible ordering of the query operators. This flexibility is controlled by a combination of fluid dynamics and a simple learning algorithm. Our initial implementation demonstrates promising results, with eddies performing nearly as well as a static optimizer/executor in static scenarios, and providing dramatic improvements in dynamic execution environments.
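The per-tuple routing idea can be illustrated with a minimal sketch. It is not the paper's implementation: it covers only selection operators (no joins, no synchronization barriers, no back-pressure between operators), and the names `Filter` and `route`, as well as the ticket rule borrowed from lottery scheduling, are assumptions made for this example. Each operator gains a ticket when it consumes a tuple and gives it back when it returns the tuple, so operators that drop more tuples accumulate tickets and tend to be chosen earlier.

```python
import random


class Filter:
    """A selection operator (hypothetical name). `tickets` is the router's
    running estimate of how many tuples this operator eliminates."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate
        self.tickets = 1  # start at 1 so every operator can win the lottery


def route(tuples, operators, rng):
    """Pass each tuple through all operators, choosing the order per tuple."""
    output = []
    for t in tuples:
        pending = list(operators)  # operators this tuple has not visited yet
        alive = True
        while pending and alive:
            # Lottery: pick a pending operator with probability
            # proportional to its current ticket count.
            weights = [op.tickets for op in pending]
            op = rng.choices(pending, weights=weights)[0]
            pending.remove(op)
            op.tickets += 1  # credit for consuming a tuple
            if op.predicate(t):
                op.tickets -= 1  # debit for handing the tuple back
            else:
                alive = False  # tuple dropped; the operator keeps its ticket
        if alive:
            output.append(t)
    return output


rng = random.Random(0)
selective = Filter("selective", lambda x: x % 10 == 0)  # passes about 10%
loose = Filter("loose", lambda x: x % 10 != 9)          # passes about 90%
result = route(range(1000), [loose, selective], rng)
print(len(result), selective.tickets, loose.tickets)
```

The query result does not depend on the routing order; only the amount of work does. If the selectivities shifted midway through the input, the ticket counts would shift with them, which is the adaptive behavior the abstract describes.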
