Effect of Live Migration on Virtual Hadoop Cluster

The growing computational demands of large-scale data analysis have made big data processing increasingly important. Meanwhile, virtualization makes it feasible to deploy Hadoop in private or public cloud environments, which offer benefits such as scalability and high availability. Live migration is an important virtualization feature that moves a running VM from one physical host to another to facilitate load balancing, maintenance, and server consolidation, and to avoid SLA violations. However, live migration adds overhead and degrades the performance of applications running inside the VM. This paper examines the performance of Hadoop when VMs are migrated from one host to another. Experiments show that job completion time, average downtime, and average migration time all increase as the number of migrated VMs increases.
