Towards Automated Workflow Deployment in the Cloud Using TOSCA

Scientific workflows play an increasingly important role in building scientific applications, while cloud computing provides on-demand access to large compute resources. Combining the two could dramatically speed up the extraction of new results from the vast amounts of scientific data now being collected. However, with the proliferation of cloud computing platforms and workflow management systems, it is increasingly challenging to define workflows that run reliably in the cloud and can be reused easily. This paper shows how TOSCA, a new standard for cloud service management, can be used to systematically specify the components and life cycle management of scientific workflows by mapping the basic elements of a real workflow onto entities specified by TOSCA. Ultimately, this will enable workflow definitions that are portable across clouds, resulting in greater reusability and reproducibility of workflows.
