Architecting Cloud Workflow: Theory and Practice

In the "big data" era, data scale and the complexity of scientific analysis and processing are growing exponentially across the scientific community. The cloud computing paradigm has been widely adopted to provide unprecedented scalability and on-demand resources, while scientific workflow management systems (SWFMSs) have proven essential to scientific computing and services computing. Combining the advantages of cloud computing and SWFMSs can offer researchers a valuable solution to the scientific "big data" problem. Although a series of works has concentrated on integrating SWFMSs with Cloud platforms, providing much experience for future research and development, a study from an architectural perspective is still missing. The main contributions of this paper are: 1) based on a comprehensive survey of the available integration options, we propose a service framework for integrating SWFMSs with Cloud computing; 2) we implement the service framework on various Cloud platforms to validate the feasibility of the proposed framework; and 3) we conduct a set of experiments to demonstrate its capability, using a NASA MODIS image processing workflow as a showcase of the implementation.
