A Framework for Design of Partially Replicated Distributed Database Systems with Migration Based Genetic Algorithms

For partially replicated distributed database systems to function efficiently, the data (relations) and operations (subqueries) of the database need to be placed judiciously at sites across the communications network. Allocating relations and operations to the most appropriate sites is a difficult problem, so this research proposes migration-based genetic algorithms to solve it. In partially replicated distributed database systems, minimizing total time usually means minimizing resource consumption and therefore maximizing system throughput. Minimizing response time, on the other hand, may be achieved by running a large number of executions in parallel at different sites, which requires higher resource consumption and therefore reduces system throughput. Workload balancing reduces the average time that queries spend waiting for CPU and I/O service at a network site, but its effect on the performance of partially replicated distributed database systems cannot be isolated from other distributed database design factors. In this research, total cost refers to the combination of total time and response time. This paper presents a framework for total cost minimization and workload balancing in partially replicated distributed database systems that considers these important database design objectives together. The framework incorporates both local processing costs, including CPU and I/O, and communication costs. To illustrate its suitability, experiments are conducted, and the results demonstrate that the proposed framework yields effective partially replicated distributed database designs.
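The trade-off described above can be illustrated with a small sketch. It is not the paper's model or algorithm: the cost figures, the rule that a subquery pays a shipping cost whenever it runs away from site 0, the equal weighting of total time and response time, and all names (`LOCAL_COST`, `SHIP_COST`, `total_cost`, `evolve`, `island_ga`) are assumptions made for illustration. The sketch assigns six subqueries to three sites and searches with a generic island-model genetic algorithm, in which several populations evolve separately and periodically exchange their best individuals by migration.

```python
import random

# Hypothetical toy instance (not taken from the paper).
LOCAL_COST = [4.0, 3.0, 5.0, 2.0, 6.0, 1.0]  # CPU + I/O time of each subquery
SHIP_COST = [1.0, 0.5, 2.0, 0.5, 1.5, 0.2]   # communication time if run away from site 0
N_SITES = 3


def total_cost(alloc, weight=0.5):
    """Weighted mix of total time (all work summed) and response time (busiest site)."""
    per_site = [0.0] * N_SITES
    for op, site in enumerate(alloc):
        comm = SHIP_COST[op] if site != 0 else 0.0
        per_site[site] += LOCAL_COST[op] + comm
    total_time = sum(per_site)     # resource consumption across all sites
    response_time = max(per_site)  # sites work in parallel, so the slowest one dominates
    return weight * total_time + (1 - weight) * response_time


def evolve(pop, rng, mutation_rate=0.2):
    """One generation: elitism, binary tournament selection, one-point crossover, mutation."""
    children = [min(pop, key=total_cost)[:]]  # keep a copy of the current best
    while len(children) < len(pop):
        a = min(rng.sample(pop, 2), key=total_cost)
        b = min(rng.sample(pop, 2), key=total_cost)
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        if rng.random() < mutation_rate:
            child[rng.randrange(len(child))] = rng.randrange(N_SITES)
        children.append(child)
    return children


def island_ga(n_islands=3, pop_size=8, generations=30, migrate_every=5, seed=0):
    """Evolve several populations and migrate the best individuals around a ring."""
    rng = random.Random(seed)
    islands = [
        [[rng.randrange(N_SITES) for _ in LOCAL_COST] for _ in range(pop_size)]
        for _ in range(n_islands)
    ]
    for gen in range(1, generations + 1):
        islands = [evolve(pop, rng) for pop in islands]
        if gen % migrate_every == 0:
            bests = [min(pop, key=total_cost)[:] for pop in islands]
            for i, pop in enumerate(islands):
                # The best of the previous island replaces this island's worst member.
                worst = max(range(pop_size), key=lambda k: total_cost(pop[k]))
                pop[worst] = bests[i - 1]
    return min((ind for pop in islands for ind in pop), key=total_cost)


if __name__ == "__main__":
    best = island_ga()
    print("allocation:", best, "cost:", round(total_cost(best), 2))
```

In this toy model, placing every subquery on site 0 avoids all communication but serializes the work, while spreading subqueries across sites shortens the busiest site's time at the price of extra shipping. The `weight` parameter controls how the two objectives are combined; the paper's actual cost model and workload-balancing terms are not reproduced here.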
