Efficient scientific workflow scheduling for deadline-constrained parallel tasks in cloud computing environments

Abstract: Data centers for cloud computing must accommodate numerous parallel task executions simultaneously and therefore host many virtual machines (VMs). Minimizing the scheduling length of parallel task sets is thus a critical requirement in cloud computing systems. In this study, we propose an efficient priority and relative distance (EPRD) algorithm that minimizes the task scheduling length for precedence-constrained workflow applications without violating the end-to-end deadline constraint. The algorithm consists of two phases. First, a task priority queue is established. Then, each task is mapped to a VM in accordance with its relative distance. The proposed method can effectively improve VM utilization and scheduling performance. Extensive experiments on randomly generated and real-world workflow applications demonstrate that the EPRD algorithm significantly outperforms existing algorithms in terms of resource reduction rate and scheduling length.
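The two-phase structure described above (build a priority queue, then map each task to a VM) can be illustrated with a generic list-scheduling sketch. This is not the EPRD algorithm itself: the abstract does not define the priority metric or the relative distance, so the sketch assumes a longest-path-to-exit rank for priorities and substitutes an earliest-start-time rule for the VM mapping. The function name `schedule`, its parameters, and the example workflow are all hypothetical, and task runtimes are assumed to be positive and identical across VMs.

```python
from functools import lru_cache


def schedule(tasks, edges, num_vms, deadline):
    """Two-phase list scheduling sketch (illustrative only, not EPRD).

    tasks:    dict mapping task name -> runtime (assumed positive)
    edges:    list of (parent, child) precedence pairs forming a DAG
    num_vms:  number of identical VMs (an assumption of this sketch)
    deadline: end-to-end deadline for the whole workflow
    """
    children = {t: [] for t in tasks}
    parents = {t: [] for t in tasks}
    for u, v in edges:
        children[u].append(v)
        parents[v].append(u)

    @lru_cache(maxsize=None)
    def rank(t):
        # Assumed priority: longest path (in runtime) from t to an exit task.
        return tasks[t] + max((rank(c) for c in children[t]), default=0)

    # Phase 1: priority queue, highest rank first. With positive runtimes,
    # a parent always outranks its children, so this order respects precedence.
    queue = sorted(tasks, key=lambda t: (-rank(t), t))

    # Phase 2: map each task to a VM. Earliest start time is used here as a
    # stand-in for the paper's relative-distance criterion.
    vm_free = [0] * num_vms
    finish, placement = {}, {}
    for t in queue:
        ready = max((finish[p] for p in parents[t]), default=0)
        vm = min(range(num_vms), key=lambda i: max(vm_free[i], ready))
        start = max(vm_free[vm], ready)
        finish[t] = start + tasks[t]
        vm_free[vm] = finish[t]
        placement[t] = vm

    makespan = max(finish.values())
    return placement, makespan, makespan <= deadline


# Hypothetical diamond-shaped workflow: a -> {b, c} -> d
example_tasks = {"a": 2, "b": 3, "c": 1, "d": 2}
example_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
placement, makespan, meets_deadline = schedule(example_tasks, example_edges, 2, 8)
```

Varying `num_vms` for a fixed `deadline` in this sketch gives a rough feel for the trade-off between the number of VMs used and the scheduling length that the abstract refers to.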
