Estimation of the Available Bandwidth in Inter-Cloud Links for Task Scheduling in Hybrid Clouds

In hybrid clouds, inter-cloud links play a key role in the execution of jobs with data dependencies. Insufficient available bandwidth in these links can increase both the makespan and the monetary cost of executing an application on public clouds, and imprecise information about the available bandwidth can lead to inefficient scheduling decisions. This paper evaluates the impact of imprecise information about the available bandwidth in inter-cloud links on workflow schedules, and it proposes a mechanism to cope with this imprecision and its effect on makespan and cost estimates. The proposed mechanism applies a deflating factor to the available bandwidth value furnished as input to the scheduler. Simulation results show that the mechanism increases the number of solutions whose makespans are shorter than the defined deadline and reduces the underestimation of makespan and cost by workflow schedulers.
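The idea of deflating the bandwidth estimate before it reaches the scheduler can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the multiplicative form `measured * (1 - factor)`, and the numeric values are all assumptions chosen for illustration, and the abstract does not state how the deflating factor is computed.

```python
def deflate_bandwidth(measured_mbps: float, deflating_factor: float) -> float:
    """Return a conservative bandwidth estimate to hand to a scheduler.

    Assumption: the factor is a fraction in [0, 1) applied multiplicatively.
    The abstract does not specify the exact form used by the authors.
    """
    if not 0.0 <= deflating_factor < 1.0:
        raise ValueError("deflating_factor must be in [0, 1)")
    return measured_mbps * (1.0 - deflating_factor)


def estimated_transfer_time(data_megabits: float, bandwidth_mbps: float) -> float:
    """Estimate, in seconds, the time to move data over an inter-cloud link."""
    if bandwidth_mbps <= 0.0:
        raise ValueError("bandwidth_mbps must be positive")
    return data_megabits / bandwidth_mbps


# Hypothetical values: a 100 Mbit/s measurement deflated by 25%.
measured = 100.0
deflated = deflate_bandwidth(measured, 0.25)

# A scheduler fed the deflated value budgets more time for the same transfer,
# so it is less likely to underestimate the makespan if the link degrades.
optimistic = estimated_transfer_time(600.0, measured)
conservative = estimated_transfer_time(600.0, deflated)
```

With these assumed numbers, the conservative estimate for the transfer is longer than the optimistic one, which is the direction of correction the abstract describes for makespan and cost estimates.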
