Adapting Scientific Workflows on Networked Clouds Using Proactive Introspection

Recent advances in cloud technologies and on-demand network circuits have created an unprecedented opportunity to run complex, data-intensive scientific applications on dynamic, networked cloud infrastructure. However, tools are lacking for supporting high-level applications such as scientific workflows on dynamically provisioned, virtualized, networked Infrastructure-as-a-Service (NIaaS) systems. In this paper, we propose an end-to-end system consisting of application-aware and application-independent controllers that provision and adapt complex scientific workflows on NIaaS systems. The application-independent controller enhances the utility of NIaaS systems for higher-level applications by closing the gap between application abstractions and resource provisioning constructs. We also present our approach to predicting dynamic resource requirements for workflows: an application-aware controller uses workflow introspection to proactively evaluate alternative candidate resource allotments. We show how these high-level resource requirements can be automatically transformed into low-level NIaaS operations to actuate infrastructure adaptation. Our evaluations show that the predictions are fairly accurate, and that the interplay of prediction and adaptation can balance performance and utilization for a representative data-intensive workflow.
