Scheduling DAG-based workflows on single cloud instances: High performance and cost effectiveness with a static scheduler

Achieving high performance cost-effectively in cloud computing is challenging when workflows have Directed Acyclic Graph (DAG)-structured inter-task dependencies. We study this problem within single cloud instances and provide empirical evidence that the static Area-Oriented DAG-Scheduling (AO) paradigm, which predetermines the order in which a DAG's tasks execute, delivers both high performance and cost effectiveness. AO produces schedules in a platform-oblivious manner: it ignores the performance characteristics of the platform's resources and focuses only on the dependency structure of the workflow. Specifically, AO's schedules strive to maximize the rate at which tasks are rendered eligible for execution. Using an archive of diverse DAG-structured workflows, we experimentally compare AO with a variety of competing DAG-schedulers: (a) the static locally optimal DAG-scheduler (LO), which, like AO, is static and platform-oblivious but chooses its DAG-ordering based on tasks' outdegrees; and (b) five dynamic versions of static schedulers (including AO and LO), each of which may violate its parent static scheduler's prescribed task order to avoid stalling. Our results provide evidence of AO's superiority over LO and its essential equivalence to dynamic-AO: neither competitor yields higher performance at a lower cost than AO does. Two aspects of these results are notable. First, AO is platform-oblivious, whereas dynamic-AO is intensely platform-sensitive; one would expect platform sensitivity to enhance performance. Second, AO outperforms LO by an order of magnitude while also incurring lower costs; one would not expect such a performance gap.
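The contrast between the two platform-oblivious ordering rules can be sketched as a toy greedy list-scheduler. This is an illustrative simplification, not the published AO algorithm: the real AO heuristic optimizes a schedule-"area" measure, whereas the `ao_score` below uses a hypothetical one-step proxy (how many successors become eligible immediately). All names and the DAG are assumptions made for exposition.

```python
# Illustrative sketch only: a toy greedy static list-scheduler contrasting an
# AO-flavoured rule (favour tasks that render successors eligible) with an
# LO-flavoured rule (favour large outdegree). The published AO heuristic
# optimizes a schedule-"area" measure; this greedy score is a hypothetical
# simplification for exposition.

def build_preds(dag):
    """dag maps each task to the set of its successor tasks."""
    preds = {t: set() for t in dag}
    for t, succs in dag.items():
        for s in succs:
            preds[s].add(t)
    return preds

def static_order(dag, score):
    """Greedily execute, at each step, the ready task with the highest score."""
    preds = build_preds(dag)
    done, order = set(), []
    while len(order) < len(dag):
        ready = [t for t in sorted(dag) if t not in done and preds[t] <= done]
        best = max(ready, key=lambda t: score(dag, preds, t, done))
        done.add(best)
        order.append(best)
    return order

def ao_score(dag, preds, t, done):
    # AO-flavoured proxy: count successors of t that become eligible
    # as soon as t completes.
    after = done | {t}
    return sum(1 for s in dag[t] if preds[s] <= after)

def lo_score(dag, preds, t, done):
    # LO-flavoured rule: rank ready tasks by outdegree alone.
    return len(dag[t])

# A small diamond-shaped DAG: a -> {b, c} -> d.
diamond = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}
print(static_order(diamond, ao_score))  # a valid topological order of the DAG
```

On this tiny diamond both rules produce the same order; the paper's point is that on large, irregular workflows the two orderings diverge sharply, with AO's eligibility-oriented ordering keeping more tasks ready for the platform's workers.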
