Generating Query Plans for Distributed Query Processing Using Genetic Algorithm

Query processing is a key determinant of the overall performance of distributed databases. It requires processing data at the sites where they reside and transmitting data between those sites. Together, these steps constitute a distributed query processing (DQP) strategy. DQP aims to arrive at an efficient strategy for a given query, which involves generating efficient query plans for that query. For distributed relational queries, the number of possible query plans grows exponentially with the number of relations accessed by the query, and it grows further when those relations have replicas at different sites. Such a large search space makes it infeasible to find optimal query plans exhaustively. This paper presents a query plan generation algorithm that uses a genetic algorithm to attempt to generate optimal query plans for a given query. The query plans so generated involve fewer sites, leading to more efficient query processing. Further, experimental results show that the proposed algorithm converges quickly towards optimal query plans for an observed pair of crossover and mutation probabilities.
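The idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the encoding (one site per relation, drawn from the sites holding a replica of that relation), the cost function (the number of distinct sites a plan touches), tournament selection, single-point crossover, elitism, and all names and parameter values below are assumptions made for illustration only.

```python
import random

def plan_cost(plan):
    """Assumed cost: number of distinct sites the plan touches (lower is better)."""
    return len(set(plan))

def random_plan(replica_sites, rng):
    # One gene per relation: the site chosen among that relation's replicas.
    return [rng.choice(sites) for sites in replica_sites]

def crossover(a, b, rng):
    # Single-point crossover; gene i still comes from relation i's replica sites.
    if len(a) < 2:
        return a[:], b[:]
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(plan, replica_sites, p_mut, rng):
    # With probability p_mut, re-draw a gene from the same relation's replica sites.
    return [rng.choice(replica_sites[i]) if rng.random() < p_mut else site
            for i, site in enumerate(plan)]

def tournament(population, rng, k=2):
    # Pick k random plans and keep the cheapest one.
    return min(rng.sample(population, k), key=plan_cost)

def generate_plans(replica_sites, pop_size=20, generations=50,
                   p_cross=0.8, p_mut=0.05, seed=0):
    """Return the final population sorted by assumed cost (cheapest first)."""
    rng = random.Random(seed)
    population = [random_plan(replica_sites, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the current best plan into the next generation.
        next_gen = [min(population, key=plan_cost)[:]]
        while len(next_gen) < pop_size:
            a, b = tournament(population, rng), tournament(population, rng)
            if rng.random() < p_cross:
                a, b = crossover(a, b, rng)
            for child in (a, b):
                if len(next_gen) < pop_size:
                    next_gen.append(mutate(child, replica_sites, p_mut, rng))
        population = next_gen
    return sorted(population, key=plan_cost)

# Hypothetical catalog: relation i is replicated at the sites in replica_sites[i].
replica_sites = [[1, 2], [2, 3], [2, 4], [1, 2, 5]]
plans = generate_plans(replica_sites)
print(plans[0], plan_cost(plans[0]))
```

Because every gene is always drawn from its own relation's replica list, crossover and mutation can never produce a plan that reads a relation from a site that does not hold it. A realistic cost model would also account for local processing and data transmission, which this sketch leaves out.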
