Evolving multi-objective strategies for task allocation of scientific workflows on public clouds

With the increase in deployment of scientific application on public and private clouds, the allocation of workflow tasks to specific cloud instances to reduce runtime and cost has emerged as an important challenge. The allocation of scientific workflows on public clouds can be described through a variety of perspectives and parameters and has been proved to be NP-complete. This paper presents an optimization framework for task allocation on public clouds. We present a solution that considers important parameters such as workflow runtime, communication overhead, and overall execution cost. Our multi-objective optimization framework builds on a simple and extensible cost model and uses a heuristic to determine the optimal number of cloud instances to be used. Using the Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3) as an example, we show how our optimization heuristics lead to significantly better strategies than other state-of-the-art approaches. Specifically, our single-objective optimization is slightly better than a simple heuristic and a particle swarm optimization approach for small workflows, and achieves significant improvements for larger workflows. In a similar manner, our multi-objective optimization obtains similar results to our single-objective optimization for small-size workflows, and achieves up to 80% improvement for large-size workflows.

[1]  Daniel S. Katz,et al.  A comparison of two methods for building astronomical image mosaics on a grid , 2005, 2005 International Conference on Parallel Processing Workshops (ICPPW'05).

[2]  James Kennedy,et al.  Particle swarm optimization , 2002, Proceedings of ICNN'95 - International Conference on Neural Networks.

[3]  Alexandru Iosup,et al.  C-Meter: A Framework for Performance Analysis of Computing Clouds , 2009, 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid.

[4]  G. Bruce Berriman,et al.  Scientific workflow applications on Amazon EC2 , 2010, 2009 5th IEEE International Conference on E-Science Workshops.

[5]  Constantinos Evangelinos,et al.  Cloud Computing for parallel Scientific HPC Applications: Feasibility of Running Coupled Atmosphere- , 2008 .

[6]  Bora Uçar,et al.  Integrated data placement and task assignment for scientific workflows in clouds , 2011, DIDC '11.

[7]  Xiao Liu,et al.  A data placement strategy in scientific cloud workflows , 2010, Future Gener. Comput. Syst..

[8]  Kenjiro Taura,et al.  File-Access Characteristics of Data-Intensive Workflow Applications , 2010, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing.

[9]  Xiao Liu,et al.  A Revised Discrete Particle Swarm Optimization for Cloud Workflow Scheduling , 2010, 2010 International Conference on Computational Intelligence and Security.

[10]  Ewa Deelman,et al.  Experiences using cloud computing for a scientific workflow application , 2011, ScienceCloud '11.

[11]  Kalyanmoy Deb,et al.  A fast and elitist multiobjective genetic algorithm: NSGA-II , 2002, IEEE Trans. Evol. Comput..

[12]  Rajkumar Buyya,et al.  A Particle Swarm Optimization-Based Heuristic for Scheduling Workflow Applications in Cloud Computing Environments , 2010, 2010 24th IEEE International Conference on Advanced Information Networking and Applications.

[13]  Edward Walker,et al.  Benchmarking Amazon EC2 for High-Performance Scientific Computing , 2008, login Usenix Mag..

[14]  Daniel S. Katz,et al.  Pegasus: A framework for mapping complex scientific workflows onto distributed systems , 2005, Sci. Program..

[15]  Edward A. Lee,et al.  Scientific workflow management and the Kepler system , 2006, Concurr. Comput. Pract. Exp..

[16]  Ishfaq Ahmad,et al.  Efficient Scheduling of Arbitrary TAsk Graphs to Multiprocessors Using a Parallel Genetic Algorithm , 1997, J. Parallel Distributed Comput..

[17]  David Fernández-Baca,et al.  Allocating Modules to Processors in a Distributed System , 1989, IEEE Trans. Software Eng..

[18]  David Abramson,et al.  Nimrod/K: Towards massively parallel dynamic Grid workflows , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.

[19]  Matthew R. Pocock,et al.  Taverna: a tool for the composition and enactment of bioinformatics workflows , 2004, Bioinform..

[20]  Mei-Hui Su,et al.  Characterization of scientific workflows , 2008, 2008 Third Workshop on Workflows in Support of Large-Scale Science.

[21]  Francisco Luna,et al.  jMetal: a Java Framework for Developing Multi-Objective Optimization Metaheuristics , 2006 .