A PTS-PGATS based approach for data-intensive scheduling in data grids

Grid computing combines computer resources in a loosely coupled, heterogeneous, and geographically dispersed environment. Data grids serve large-scale data-intensive applications that produce and consume huge amounts of data distributed across a large number of machines. A data grid application is composed of sets of independent tasks, each of which requires massive distributed data sets that may each be replicated on different resources. To reduce the completion time of the application and improve the performance of the grid, appropriate computing resources must be selected to execute the tasks, and appropriate storage resources must be selected to serve the files the tasks require. The problem can therefore be broken into two sub-problems: selection of storage resources and assignment of tasks to computing resources. This paper proposes a scheduler that is broken into three parts that can run in parallel and that uses both parallel tabu search and a parallel genetic algorithm. The proposed algorithm is evaluated by comparing it with related algorithms that target makespan minimization. Simulation results show that the proposed approach can be a good choice for scheduling large data grid applications.
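
The two sub-problems above can be made concrete with a small sketch of how the makespan of one candidate schedule might be evaluated. This is an illustrative cost model only, not the paper's formulation: the names `Task`, `makespan`, `assignment`, and `replica_choice`, as well as the assumptions that files are fetched sequentially and that each computing resource runs its tasks one after another, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    length: float  # amount of computation (hypothetical work units)
    files: tuple   # names of the files the task must read


def makespan(tasks, assignment, replica_choice, speed, file_size, bandwidth):
    """Completion time of the slowest computing resource.

    assignment[i]          -> computing resource chosen for task i
    replica_choice[(i, f)] -> storage resource serving file f to task i
    speed[c]               -> work units per second on computing resource c
    bandwidth[(s, c)]      -> transfer rate from storage s to computing resource c
    Assumption: transfers and executions on one resource do not overlap.
    """
    busy = {}
    for i, task in enumerate(tasks):
        cpu = assignment[i]
        transfer = sum(
            file_size[f] / bandwidth[(replica_choice[(i, f)], cpu)]
            for f in task.files
        )
        busy[cpu] = busy.get(cpu, 0.0) + transfer + task.length / speed[cpu]
    return max(busy.values(), default=0.0)


# Toy instance: two tasks, two computing resources, one file with two replicas.
tasks = [Task(10.0, ("a",)), Task(20.0, ("a",))]
speed = {"c1": 1.0, "c2": 2.0}
file_size = {"a": 100.0}
bandwidth = {
    ("s1", "c1"): 10.0, ("s1", "c2"): 50.0,
    ("s2", "c1"): 20.0, ("s2", "c2"): 25.0,
}
assignment = ["c1", "c2"]                        # task-to-compute decision
replica_choice = {(0, "a"): "s2", (1, "a"): "s1"}  # storage selection decision

result = makespan(tasks, assignment, replica_choice, speed, file_size, bandwidth)
print(result)  # 15.0
```

A search method such as tabu search or a genetic algorithm would treat `assignment` and `replica_choice` as the decision variables and use a function like this as the objective to minimize.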
