Computing across two or more data centers connected over the Internet is growing in popularity, driven by an explosion in demand for scalable computing and by the pay-as-you-go pricing offered in the cloud. Cloud-bursting addresses this scaling up and down across data centers (i.e., between private and public clouds), but offering service-level guarantees remains a challenge for inter-cloud computation, particularly for best-effort traffic and large files. The parallel workload we address is real-time and involves inter-cloud processing and analysis of images and documents. In our production printing domain, dedicated processing and network resources are cost-prohibitive. The problem is further exacerbated by data-intensive computing: we encounter huge file sizes that are atypical of inter-cloud parallel processing. To address these problems, we propose three flavors of autonomic cloud-bursting schedulers that offer probabilistic guarantees on the service levels required by customers (such as speed-up and queue sequence preservation) by adapting to changing workload characteristics, variations in bandwidth, and available resources. In particular, these opportunistic schedulers use a quadratic response surface model for processing time in concert with a time-of-day-dependent bandwidth predictor to increase throughput and utilization while simultaneously reducing out-of-sequence completions for a document processing workload.
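To make the last sentence concrete, the following is a minimal sketch of how a quadratic response surface and a time-of-day bandwidth table could feed a burst decision. It is an illustration under stated assumptions, not the schedulers proposed here: the two predictors (file size in MB and page count), the hourly lookup table, the decision rule, and every function name are hypothetical.

```python
import numpy as np


def quadratic_features(x):
    """Expand rows [x1, x2] into the full quadratic basis
    [1, x1, x2, x1^2, x2^2, x1*x2]."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])


def fit_response_surface(x, seconds):
    """Least-squares fit of the six quadratic coefficients.

    x holds one row per past job (assumed predictors: size in MB, pages);
    seconds holds the observed processing times.
    """
    design = quadratic_features(x)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(seconds, dtype=float), rcond=None)
    return coeffs


def predict_processing_time(coeffs, size_mb, pages):
    """Predicted processing time in seconds for a single job."""
    return float((quadratic_features([[size_mb, pages]]) @ coeffs)[0])


def predict_bandwidth_mbps(hourly_mbps, hour):
    """Time-of-day predictor: here just a 24-entry table of average Mbps."""
    return hourly_mbps[hour % 24]


def should_burst(coeffs, hourly_mbps, size_mb, pages, hour, local_wait_s):
    """Burst when the estimated remote completion beats the local one.

    Assumption: both sites process at the speed the model predicts, so the
    comparison is transfer time versus local queue wait.
    """
    processing_s = predict_processing_time(coeffs, size_mb, pages)
    transfer_s = size_mb * 8.0 / predict_bandwidth_mbps(hourly_mbps, hour)
    return transfer_s + processing_s < local_wait_s + processing_s
```

A scheduler that also has to bound out-of-sequence completions would need more than this point estimate, for example the spread of the prediction error, which the sketch leaves out.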