Evolutionary Algorithms for Allocating Data in Distributed Database Systems

A major cost of executing queries in a distributed database system is the data transfer cost incurred in moving the relations (fragments) accessed by a query from the sites where they are stored to the site where the query is initiated. The objective of a data allocation algorithm is to determine an assignment of fragments to sites that minimizes the total data transfer cost incurred in executing a set of queries. This is equivalent to minimizing the average query execution time, which is of primary importance in a wide class of conventional as well as multimedia distributed database systems. The data allocation problem, however, is NP-complete, and thus requires fast heuristics to generate efficient solutions. Furthermore, the optimal allocation of database objects depends heavily on the query execution strategy employed by the distributed database system, and a given query execution strategy usually assumes an allocation of the fragments. We develop a site-independent fragment dependency graph representation to model the dependencies among the fragments accessed by a query, and use it to formulate and tackle data allocation problems for distributed database systems based on the query-site and move-small query execution strategies. We have designed and evaluated evolutionary algorithms for data allocation in distributed database systems.
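To make the objective concrete, the following is a minimal sketch of a simple genetic algorithm for a toy allocation instance. It is not the algorithm evaluated in this work. It assumes a simplified cost model in which every fragment a query accesses is shipped to the site where the query is initiated, adds a hypothetical site-capacity penalty, and ignores the fragment dependency graph entirely. All names and numbers (`SIZES`, `CAPACITY`, `UNIT_COST`, `QUERIES`, `PENALTY`, `transfer_cost`, `evolve`, `brute_force_optimum`) are illustrative assumptions.

```python
import random
from itertools import product

# Hypothetical toy instance; every number here is an illustrative assumption.
SIZES = [10, 40, 25, 5]        # size of each fragment
CAPACITY = [50, 50, 50]        # assumed storage capacity of each site
UNIT_COST = [[0, 2, 5],        # UNIT_COST[a][b]: cost per unit of data sent from site a to site b
             [2, 0, 3],
             [5, 3, 0]]
# Each query: (initiating site, frequency, fragments accessed)
QUERIES = [(0, 5, [0, 1]), (1, 3, [1, 2]), (2, 4, [0, 2, 3]), (0, 1, [3])]
PENALTY = 10_000               # assumed cost per unit of capacity overflow


def transfer_cost(alloc):
    """Cost of an allocation, where alloc[f] is the site holding fragment f."""
    cost = 0
    for site, freq, frags in QUERIES:
        for f in frags:
            # Ship fragment f from its storage site to the query's site.
            cost += freq * SIZES[f] * UNIT_COST[alloc[f]][site]
    load = [0] * len(CAPACITY)
    for f, s in enumerate(alloc):
        load[s] += SIZES[f]
    overflow = sum(max(0, used - cap) for used, cap in zip(load, CAPACITY))
    return cost + PENALTY * overflow


def evolve(pop_size=30, generations=40, mutation_rate=0.1, seed=0):
    """Simple GA: binary tournament selection, one-point crossover,
    per-gene mutation, and elitism (the best individual always survives)."""
    rng = random.Random(seed)
    n_frag, n_site = len(SIZES), len(CAPACITY)
    pop = [[rng.randrange(n_site) for _ in range(n_frag)]
           for _ in range(pop_size)]
    history = []  # best cost seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=transfer_cost)
        history.append(transfer_cost(pop[0]))
        nxt = [pop[0][:]]  # elitism: carry the best allocation over unchanged
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 2), key=transfer_cost)
            p2 = min(rng.sample(pop, 2), key=transfer_cost)
            cut = rng.randrange(1, n_frag)
            child = p1[:cut] + p2[cut:]
            child = [rng.randrange(n_site) if rng.random() < mutation_rate else g
                     for g in child]
            nxt.append(child)
        pop = nxt
    best = min(pop, key=transfer_cost)
    history.append(transfer_cost(best))
    return best, history


def brute_force_optimum():
    """Exhaustive search; feasible only because the toy instance is tiny."""
    n_frag, n_site = len(SIZES), len(CAPACITY)
    return min(transfer_cost(list(a)) for a in product(range(n_site), repeat=n_frag))


if __name__ == "__main__":
    best, history = evolve()
    print("allocation:", best, "cost:", history[-1])
    print("exhaustive optimum:", brute_force_optimum())
```

On an instance this small, exhaustive search is cheap and serves only as a sanity check. The heuristic becomes relevant when the number of fragments and sites makes enumeration impractical, which is the setting the NP-completeness remark above refers to.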
