A Dynamic Query Processing Architecture for Data Integration Systems

As query engines are scaled and federated, they must cope with highly unpredictable and changeable environments. In the Telegraph project, we are attempting to architect and implement a continuously adaptive query engine suitable for global-area systems, massive parallelism, and sensor networks. To set the stage for our research, we present a survey of prior work on adaptive query processing, focusing on three characterizations of adaptivity: the frequency of adaptivity, the effects of adaptivity, and the extent of adaptivity. Given this survey, we sketch directions for research in the Telegraph project.
