The Case for Resource Sharing in Scientific Workflow Executions

Scientific workflows have become mainstream for conducting large-scale scientific research. Executing these applications can be very costly in terms of computational resources. Optimizing their resource utilization and efficiency is therefore highly desirable, even in computational environments where processing resources are plentiful, such as clouds. In this work, we explore the use of shared multiprocessors within a single virtual machine. Using a public cloud provider and real-world applications, we show that the use of dedicated processors can lead to sub-optimal performance of scientific workflows. This is a first step towards the creation of a self-aware resource management system in line with state-of-the-art multi-tenant platforms.
