Cost-aware DAG scheduling algorithms for minimizing execution cost on cloud resources

Directed acyclic graph (DAG) scheduling is a well-studied problem because a DAG can describe a wide range of complex applications, including scientific applications and parallel computing jobs. Most DAG scheduling algorithms were proposed to minimize the job makespan (i.e., execution time) on a multiprocessor computer or cluster. However, as cost-driven public cloud services have become an attractive and popular platform for providing computing resources, cost minimization has emerged as a new critical issue. The objective of this work is therefore to formulate and solve the cost optimization problem for scheduling DAGs on an IaaS cloud platform, where task scheduling must be coordinated with resource provisioning to achieve the optimal solution. In this paper, we propose both optimal and heuristic scheduling algorithms and evaluate them across a variety of DAGs using the EC2 price model. Compared with cost-oblivious DAG scheduling algorithms that aim to minimize makespan or resource usage, our cost-aware heuristic algorithm reduces cost by 20–50% and achieves a cost within a factor of 1.16 of the optimal.
