Multiple Workflow Scheduling Strategies with User Run Time Estimates on a Grid

In this paper, we present an experimental study of deterministic non-preemptive multiple workflow scheduling strategies on a Grid. We distinguish twenty-five strategies depending on the type and amount of information they require. We analyze scheduling strategies that consist of two or four stages: labeling, adaptive allocation, prioritization, and parallel machine scheduling. We apply these strategies to the execution of the Cybershake, Epigenomics, Genome, Inspiral, LIGO, Montage, and SIPHT workflow applications. To compare performance, we perform a joint analysis of three metrics: approximation factor, mean critical path waiting time, and critical path slowdown. A case study is presented, and its results indicate that well-known DAG scheduling algorithms designed for single-DAG and single-machine settings are not well suited to Grid scheduling scenarios in which user run time estimates are available. We show that the proposed strategies outperform the other strategies on all three metrics. The robustness of these strategies is also discussed.
