Mapping DAG-based applications to multiclusters with background workload

Before an application modelled as a directed acyclic graph (DAG) is executed on a heterogeneous system, a DAG mapping policy is often applied. The mapping determines which tasks of the DAG-based application are to be executed on each computational resource. The tasks are then sent to their assigned resources, where they are orchestrated in the pre-designed pattern to complete the work. Most DAG mapping policies in the literature assume that each computational resource is a single-processor node, i.e. that the tasks mapped to a resource run in sequence. Our studies demonstrate that if the resource is actually a cluster with multiple processing nodes, this assumption leads to a misperception of the tasks' execution times and execution order. This disturbs the pre-designed cooperation among tasks, so the expected performance cannot be achieved. In this paper, a DAG mapping algorithm is presented for multicluster architectures. Each constituent cluster in the multicluster is shared with background workload (from other users) and has its own independent local scheduler. The multicluster DAG mapping policy is based on theoretical analysis, and its performance is evaluated through extensive experimental studies. The results show that, compared with conventional DAG mapping policies, the new scheme can significantly improve the scheduling performance of a DAG-based application, as measured by schedule length.
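The misperception described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes independent tasks, identical processing nodes, no background workload, and a hypothetical local scheduler that starts tasks in list order on the earliest-free node. The names `finish_times` and `completion_order` are introduced here for illustration only.

```python
import heapq


def finish_times(durations, nodes):
    """Finish time of each task on a resource with `nodes` identical nodes.

    Assumption: tasks start in list order, each on the earliest-free node.
    """
    free_at = [0.0] * nodes          # time at which each node becomes free
    heapq.heapify(free_at)
    finish = []
    for duration in durations:
        start = heapq.heappop(free_at)   # earliest-free node
        end = start + duration
        finish.append(end)
        heapq.heappush(free_at, end)
    return finish


def completion_order(finish):
    """Task indices sorted by finish time."""
    return sorted(range(len(finish)), key=lambda i: finish[i])


durations = [4.0, 1.0, 2.0]                   # three tasks mapped to one resource

assumed = finish_times(durations, nodes=1)    # single-processor assumption
actual = finish_times(durations, nodes=2)     # resource is really a 2-node cluster

print(assumed, completion_order(assumed))     # [4.0, 5.0, 7.0] [0, 1, 2]
print(actual, completion_order(actual))       # [4.0, 1.0, 3.0] [1, 2, 0]
```

Under the single-processor assumption the mapper expects task 0 to finish first and the resource to be busy until time 7. On the two-node cluster task 0 finishes last and the resource is free at time 4, so any successor tasks planned around the assumed times and order would be scheduled on wrong information.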
