Improved genetic algorithm for scheduling divisible data grid application

Data grid technology promises to let geographically distributed scientists access and share physically distributed resources such as computing resources, networks, storage, and, most importantly, data collections for large-scale, data-intensive problems. In many data grid applications, data can be decomposed into multiple independent sub-datasets and distributed for parallel execution and analysis. In this paper, we exploit this property and propose an improved genetic algorithm (IGA) for scheduling divisible data grid applications. A good heuristic approach is used to generate the initial population. Experimental results show that the proposed IGA performs better than the genetic algorithm (GA).
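The abstract does not give the chromosome encoding, cost model, or seeding heuristic, so the following is only a minimal illustrative sketch of the general idea: a genetic algorithm over divisible-load fractions whose initial population can optionally include one heuristically built individual. Everything in it is an assumption made for illustration, not the paper's method: the linear per-unit cost model (`cost_per_unit`, combining transfer and compute time per site), the inverse-cost heuristic, the operators, and all function names are hypothetical.

```python
import random


def makespan(fractions, total_data, cost_per_unit):
    """Finish time of the slowest site under an assumed linear cost model."""
    return max(f * total_data * c for f, c in zip(fractions, cost_per_unit))


def normalize(weights):
    """Scale positive weights so they sum to 1 (a valid split of the dataset)."""
    total = sum(weights)
    return [w / total for w in weights]


def random_chromosome(n, rng):
    """Random split of the dataset across n sites."""
    return normalize([rng.random() + 1e-9 for _ in range(n)])


def heuristic_chromosome(cost_per_unit):
    """Assumed heuristic: each site's share is inversely proportional to its cost."""
    return normalize([1.0 / c for c in cost_per_unit])


def evolve(cost_per_unit, total_data, seeded,
           pop_size=20, generations=50, rng_seed=0):
    """Run a small GA; if `seeded`, one individual comes from the heuristic."""
    rng = random.Random(rng_seed)
    n = len(cost_per_unit)
    population = [random_chromosome(n, rng) for _ in range(pop_size)]
    if seeded:
        population[0] = heuristic_chromosome(cost_per_unit)

    def fitness(chromosome):
        return makespan(chromosome, total_data, cost_per_unit)

    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            mix = rng.random()
            # Arithmetic crossover: a convex combination of two parents.
            child = [mix * x + (1 - mix) * y for x, y in zip(a, b)]
            # Mutation: rescale one randomly chosen share, then renormalize.
            i = rng.randrange(n)
            child[i] *= rng.uniform(0.5, 1.5)
            children.append(normalize(child))
        population = parents + children
    return min(population, key=fitness)


if __name__ == "__main__":
    costs = [1.0, 2.0, 4.0]  # hypothetical per-unit cost of three sites
    for seeded in (False, True):
        best = evolve(costs, total_data=100.0, seeded=seeded)
        print(seeded, round(makespan(best, 100.0, costs), 3))
```

Under this deliberately simple cost model the inverse-cost split already equalizes the sites' finish times, so the seeded run starts from a schedule the unseeded run has to discover; with a richer model (for example, multiple data sources or capacity constraints) a heuristic seed would only be a starting point for the search.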
