Chapter Two - Revisiting VM performance and optimization challenges for big data

Abstract: Virtualization in cloud computing aims to maximize resource utilization and minimize cost by deploying multiple Virtual Machines (VMs) on a single physical server, where they share resources such as CPU, cache, I/O, and memory. Sharing these resources can cause severe performance degradation, which in turn requires VM migration techniques to restore performance. The introduction of big data has made performance enhancement more challenging because of the extensive volume, velocity, variety, variability, and veracity of the data. Existing VM performance and optimization challenges therefore need to be revisited for big data use cases. This chapter presents the VM performance challenges triggered by big data, focusing on big data applications and storage migration in cloud computing. State-of-the-art VM migration techniques are evaluated against the challenges posed by big data to outline possible solutions and research challenges.
