Query optimization in distributed networks of autonomous database systems

Large-scale distributed environments, where each node is completely autonomous and offers services to its peers through external communication, pose significant challenges to query processing and optimization. Autonomy is the main source of the problem, as it results in a lack of knowledge about any particular node with respect to the information it can produce and its characteristics, for example, the cost of production or the quality of the produced results. In this article, inspired by e-commerce technology, we recognize queries as commodities and model query optimization as a trading negotiation process. Subquery answers and subquery operator execution jobs are traded between nodes until deals are struck with some nodes for all of them. Such trading may also occur recursively, in the sense that some nodes may play the role of intermediaries between other nodes (subcontracting). We identify the key parameters of the overall framework and suggest several potential alternatives for each one. In comparison to trading negotiations for e-commerce, query optimization faces unique challenges that stem primarily from the fact that queries have a complex structure and can be broken into smaller parts. We address these challenges through a particular instantiation of our framework, focusing primarily on the optimization algorithms run on “buying” and “selling” nodes, the evaluation metrics of the queries, and the negotiation strategy. Finally, we present the results of several experiments that demonstrate the performance characteristics of our approach compared to those of traditional query optimization.
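
To make the trading idea concrete, the following is a minimal sketch of a single bidding round, not the algorithm of the article. It assumes that each selling node publishes a fixed price for the subqueries it can answer and that the buying node simply accepts the cheapest offer per subquery. The names `Offer`, `request_bids`, and `strike_deals`, as well as the node names and costs, are hypothetical and chosen only for illustration; subcontracting and multi-round negotiation are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One seller's offer to answer one subquery at a given cost (hypothetical model)."""
    seller: str
    subquery: str
    cost: float


def request_bids(subqueries, sellers):
    """Collect an offer from every seller for each subquery it is able to answer."""
    offers = []
    for name, price_list in sellers.items():
        for sq in subqueries:
            if sq in price_list:
                offers.append(Offer(name, sq, price_list[sq]))
    return offers


def strike_deals(subqueries, offers):
    """Keep the cheapest offer per subquery; also report subqueries with no offer."""
    deals = {}
    for offer in offers:
        best = deals.get(offer.subquery)
        if best is None or offer.cost < best.cost:
            deals[offer.subquery] = offer
    unsold = [sq for sq in subqueries if sq not in deals]
    return deals, unsold


# Illustrative data: two autonomous nodes with private price lists (assumed values).
sellers = {
    "node_a": {"scan(orders)": 4.0, "join(orders,customers)": 9.0},
    "node_b": {"scan(orders)": 3.0, "scan(customers)": 2.5},
}
subqueries = ["scan(orders)", "scan(customers)", "join(orders,customers)"]

deals, unsold = strike_deals(subqueries, request_bids(subqueries, sellers))
for sq in subqueries:
    if sq in deals:
        print(f"{sq}: {deals[sq].seller} at cost {deals[sq].cost}")
print("unsold:", unsold)
```

In this simplified setting the buyer needs no knowledge of a seller's internals: it sees only the offers. A fuller treatment would let sellers rewrite or split the requested subqueries and would repeat the round until no subquery remains unsold.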
