Generating Distributed Query Processing Plans Using Genetic Algorithm

Distributed query processing has become essential for meeting the changing business needs of users. Its goal is to arrive at an optimal query processing plan for a given distributed query. This is a complex task, because the number of possible query processing plans grows rapidly with the number of sites used, and relations accessed, by the query. Optimal plans therefore have to be identified among all possible plans. The approach presented in this paper attempts to generate such optimal query processing plans using a genetic algorithm. Under this approach, query plans whose required data resides close together are considered more efficient and are therefore the ones generated, which should result in efficient query processing. Experimental results show that the approach is able to generate such optimal query processing plans in fewer generations.
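
The idea of favouring plans whose data sits close together can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: a plan is encoded as one chosen site per relation, closeness is approximated by the number of distinct sites a plan touches, and the catalogue, parameter values, and function names (`CATALOG`, `cost`, `evolve`, and so on) are all hypothetical.

```python
import random

# Hypothetical catalogue: relation -> sites that hold a copy of it.
CATALOG = {
    "R1": [1, 2, 3],
    "R2": [2, 4],
    "R3": [1, 2],
    "R4": [2, 3, 4],
}
RELATIONS = sorted(CATALOG)


def cost(plan):
    """Assumed proxy for closeness: distinct sites touched (lower is better)."""
    return len(set(plan))


def random_plan(rng):
    """Pick one valid site for every relation."""
    return [rng.choice(CATALOG[r]) for r in RELATIONS]


def tournament(population, rng, k=2):
    """Binary tournament: the cheaper of k randomly sampled plans wins."""
    return min(rng.sample(population, k), key=cost)


def crossover(a, b, rng):
    """Single-point crossover; genes stay aligned with their relations."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(plan, rng, rate=0.1):
    """With probability `rate`, re-draw a gene from that relation's sites."""
    return [rng.choice(CATALOG[r]) if rng.random() < rate else site
            for r, site in zip(RELATIONS, plan)]


def evolve(generations=30, size=20, seed=0):
    """Run the sketch GA and return the cheapest plan seen so far."""
    rng = random.Random(seed)
    population = [random_plan(rng) for _ in range(size)]
    best = min(population, key=cost)
    for _ in range(generations):
        population = [
            mutate(crossover(tournament(population, rng),
                             tournament(population, rng), rng), rng)
            for _ in range(size)
        ]
        best = min(population + [best], key=cost)
    return best


if __name__ == "__main__":
    plan = evolve()
    print(dict(zip(RELATIONS, plan)), "sites used:", cost(plan))
```

Because crossover and mutation only ever choose sites from the relation's own candidate list, every offspring remains a valid plan. A realistic cost model would also account for relation sizes and communication costs, which this sketch deliberately leaves out.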
