A Responsive Knapsack-Based Algorithm for Resource Provisioning and Scheduling of Scientific Workflows in Clouds

Scientific workflows are used to process vast amounts of data and to conduct large-scale experiments and simulations. They are time-consuming and resource-intensive applications that benefit from running on distributed platforms. In particular, scientific workflows can leverage the ease of access, affordability, and scalability offered by cloud computing. To achieve this, innovative and efficient ways of orchestrating the workflow tasks and managing the compute resources in a cost-conscious manner need to be developed. We propose an adaptive resource provisioning and scheduling algorithm for scientific workflows deployed in Infrastructure as a Service (IaaS) clouds. Our algorithm was designed to address challenges specific to clouds, such as the pay-as-you-go model, the performance variation of resources, and the on-demand access to unlimited, heterogeneous virtual machines. It responds to the dynamics of the cloud infrastructure and generates efficient solutions that meet a user-defined deadline while minimising the overall cost of the infrastructure used. Our simulation experiments demonstrate that it performs better than other state-of-the-art algorithms.
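
The abstract does not spell out the knapsack formulation, so the following is only an illustrative sketch of how a deadline-and-cost trade-off can be cast as an unbounded (covering) knapsack problem; it is not the algorithm proposed in the paper. It assumes a bag of identical tasks and a hypothetical list of VM types, each described by how many tasks one VM can finish before the deadline and what one VM costs. The function name `cheapest_vm_mix` and the tuple layout are assumptions made for this example.

```python
from math import inf


def cheapest_vm_mix(num_tasks, vm_types):
    """Pick a multiset of VMs that covers num_tasks at minimum cost.

    vm_types is a list of (name, tasks_before_deadline, cost) tuples.
    These fields are illustrative assumptions, not the paper's model.
    Returns (total_cost, {vm_name: count}); cost is inf if infeasible.
    """
    # best[n] = minimum cost to finish at least n tasks before the deadline
    best = [0.0] + [inf] * num_tasks
    choice = [None] * (num_tasks + 1)

    for n in range(1, num_tasks + 1):
        for name, capacity, cost in vm_types:
            if capacity <= 0:
                continue  # this VM type cannot finish any task in time
            # Leasing one more VM of this type leaves n - capacity tasks
            candidate = cost + best[max(0, n - capacity)]
            if candidate < best[n]:
                best[n] = candidate
                choice[n] = (name, capacity)

    # Walk the recorded choices backwards to recover the VM mix
    mix = {}
    n = num_tasks
    while n > 0 and choice[n] is not None:
        name, capacity = choice[n]
        mix[name] = mix.get(name, 0) + 1
        n = max(0, n - capacity)
    return best[num_tasks], mix


if __name__ == "__main__":
    # Hypothetical VM catalogue: (name, tasks finished by deadline, price)
    catalogue = [("small", 2, 1.0), ("large", 5, 2.0)]
    print(cheapest_vm_mix(10, catalogue))
```

Each VM type may be chosen any number of times, which is what makes the problem unbounded and matches the on-demand access to an effectively unlimited pool of heterogeneous VMs mentioned above. A responsive scheduler would presumably re-run such a computation as measured task runtimes drift from their estimates, although that loop is not shown here.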
