Refining the estimation of the available bandwidth in inter-cloud links for task scheduling

In hybrid clouds, the available bandwidth in inter-cloud links is highly variable. Overestimating the available bandwidth on these channels at scheduling time can lengthen the makespan and cause deadline misses. In this paper, we propose a procedure for deflating the estimated available bandwidth used as input to cloud schedulers, since schedulers are not usually designed to cope with inaccurate information on available bandwidth. The procedure is based on multiple linear regression over historical information from previous workflow executions. Results show that the proposed procedure can increase the number of valid schedules without increasing the makespan and cost estimations, regardless of the variability in the available bandwidth during the execution of an application workflow.
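
The general idea of deflating a bandwidth estimate with a regression model fitted on past executions can be illustrated with a minimal sketch. The abstract does not specify the regression variables, so everything below is an assumption made for illustration: the predictors (the measured bandwidth estimate and an expected uncertainty level), the response (a deflation factor that would have produced a valid schedule in a past run), the `HISTORY` values, and the function names are all hypothetical and are not taken from the paper.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares: solve the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit the model")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def deflate_bandwidth(estimate_mbps, uncertainty, coeffs):
    """Return the deflated estimate to hand to the scheduler."""
    b0, b1, b2 = coeffs
    factor = b0 + b1 * estimate_mbps + b2 * uncertainty
    factor = min(max(factor, 0.0), 1.0)  # keep the factor in [0, 1]
    return estimate_mbps * (1.0 - factor)


# Hypothetical history of past workflow executions (illustrative values only):
# (estimated bandwidth in Mbps, expected uncertainty as a fraction)
HISTORY = [(10, 0.1), (20, 0.2), (30, 0.1), (40, 0.5), (50, 0.3)]
# Deflation factor that, in hindsight, would have kept each schedule valid
FACTORS = [0.11, 0.17, 0.13, 0.34, 0.25]

coeffs = fit_linear_regression(HISTORY, FACTORS)
deflated = deflate_bandwidth(40, 0.5, coeffs)
```

In this sketch the scheduler itself is left unchanged: it receives `deflated` in place of the raw measurement, which matches the abstract's framing of the procedure as a preprocessing step on the scheduler's input.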
