Optimizing Multi-deployment on Clouds by Means of Self-adaptive Prefetching

As Infrastructure-as-a-Service (IaaS) cloud economics become increasingly complex and dynamic, resource costs can vary greatly over short periods of time. The ability to deploy, boot, and terminate virtual machines (VMs) very quickly is therefore critical: it lets cloud users exploit elasticity to find the optimal trade-off between computational needs (number of resources, usage time) and budget constraints. This paper proposes an adaptive prefetching mechanism that aims to reduce the time required to simultaneously boot a large number of VM instances on clouds from the same initial VM image (multi-deployment). Our proposal does not require any foreknowledge of the exact access pattern. Instead, it adapts to the pattern dynamically at run time, enabling slower instances to learn from the experience of faster ones. Since all booting instances typically access only a small part of the virtual image, following almost the same pattern, the required data can be prefetched in the background. Large-scale experiments under concurrency on hundreds of nodes show that such a prefetching mechanism can achieve a speed-up of up to 35% compared to simple on-demand fetching.
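The idea of slower instances learning from faster ones can be illustrated with a minimal sketch. It is not the mechanism evaluated in the paper: the names `SharedAccessLog` and `Instance`, the chunk granularity, and the `budget` parameter are all hypothetical, and real fetching is reduced to set operations. The sketch only assumes that instances report which image chunks they read, and that a lagging instance prefetches chunks others have already reported.

```python
from collections import OrderedDict


class SharedAccessLog:
    """Hypothetical shared record of the image chunks read by booting instances."""

    def __init__(self):
        # chunk id -> number of reads reported, kept in first-seen order
        self._seen = OrderedDict()

    def report(self, chunk):
        self._seen[chunk] = self._seen.get(chunk, 0) + 1

    def hints(self):
        # Chunks in the order they were first accessed by any instance.
        return list(self._seen)


class Instance:
    """Hypothetical booting VM instance with a local chunk cache."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.cache = set()   # chunks already available locally
        self.on_demand = 0   # reads that had to wait for a remote fetch

    def prefetch(self, budget):
        """Fetch up to `budget` hinted chunks that are not cached yet."""
        fetched = 0
        for chunk in self.log.hints():
            if fetched == budget:
                break
            if chunk not in self.cache:
                self.cache.add(chunk)
                fetched += 1

    def read(self, chunk):
        """Read a chunk during boot; fall back to on-demand fetching on a miss."""
        if chunk not in self.cache:
            self.on_demand += 1
            self.cache.add(chunk)
        self.log.report(chunk)


# Assumed boot access pattern, shared by both instances.
boot_pattern = [0, 1, 5, 6, 2]
log = SharedAccessLog()
fast, slow = Instance("fast", log), Instance("slow", log)

for chunk in boot_pattern:   # the fast instance boots first, all on demand
    fast.read(chunk)

slow.prefetch(budget=4)      # the slow instance prefetches from the fast one's trace
for chunk in boot_pattern:
    slow.read(chunk)

print(fast.on_demand, slow.on_demand)  # prints "5 1"
```

In this toy run the fast instance pays for five on-demand fetches, while the slow one pays for a single fetch because four of its five chunks were prefetched from the shared hints. No pattern is supplied in advance: the hints come only from accesses observed at run time.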
