Supporting Data-Intensive Workflows in Software-Defined Federated Multi-Clouds

Cloud computing is emerging as a viable platform for scientific exploration. Elastic, on-demand access to resources and other services, the abstraction of “unlimited” resources, and attractive pricing models give scientists incentives to move their workflows into clouds. Generalizing these concepts beyond a single virtualized datacenter, it is possible to create federated marketplaces in which different types of resources (e.g., clouds, HPC grids, supercomputers), possibly geographically distributed, are collectively exposed as a single elastic infrastructure. This presents opportunities for optimizing the execution of application workflows with heterogeneous and dynamic requirements and for tackling larger-scale problems. In this paper, we introduce a framework to manage the end-to-end execution of data-intensive application workflows in a dynamic software-defined resource federation. This framework enables the autonomic execution of workflows by elastically provisioning an appropriate set of resources that meets application requirements, and by adapting this set of resources at runtime as the requirements change. It also allows users to customize the scheduling policies that drive the way resources are federated and used. To demonstrate the benefits of our approach, we study the execution of two different data-intensive scientific workflows in a multi-cloud federation using different policies and objective functions.
