MVAPICH2 over OpenStack with SR-IOV: An Efficient Approach to Build HPC Clouds

Cloud computing with virtualization offers attractive flexibility and elasticity by providing a platform for consolidating complex IT resources in a scalable manner. However, running HPC applications efficiently on cloud computing systems remains challenging. One of the biggest hurdles in building efficient HPC clouds is the unsatisfactory performance of the underlying virtualized environments, and of virtualized I/O devices in particular. Recently, Single Root I/O Virtualization (SR-IOV) technology has been steadily gaining momentum for high-performance interconnects such as InfiniBand and 10GigE. Because of its near-native performance for inter-node communication, many cloud systems, such as Amazon EC2, use SR-IOV in their production environments. Nevertheless, recent studies have shown that the SR-IOV scheme lacks locality-aware communication support, which leads to performance overheads for inter-VM communication within the same physical node. In this paper, we propose an efficient approach to building HPC clouds based on MVAPICH2 over OpenStack with SR-IOV. We first propose an extension to the OpenStack Nova system that enables the Inter-VM Shmem (IVShmem) channel in deployed virtual machines. We then present and discuss our high-performance design of a virtual-machine-aware MVAPICH2 library for OpenStack-based HPC clouds. Our design takes full advantage of SR-IOV for inter-node communication and of IVShmem for intra-node communication. We conduct a comprehensive performance evaluation with micro-benchmarks and HPC applications on an experimental OpenStack-based HPC cloud and on Amazon EC2. The results on the experimental HPC cloud show that our design and extension can deliver near bare-metal performance for SR-IOV-based HPC clouds with virtualization. Further, compared with the performance on EC2, our experimental HPC cloud exhibits up to 160X, 65X, and 12X improvement potential for point-to-point, collective, and application-level performance, respectively, for future HPC clouds.
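
The locality-aware channel choice described above (IVShmem for VMs that share a physical node, SR-IOV otherwise) can be illustrated with a minimal Python sketch. This is not the MVAPICH2 implementation; the names `Endpoint`, `Channel`, `select_channel`, and the `host_id` field are hypothetical and used only for illustration, and the sketch assumes each process can learn which physical host its peer runs on.

```python
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    IVSHMEM = "ivshmem"  # shared-memory path between VMs on one physical node
    SR_IOV = "sr-iov"    # virtual-function path through the network adapter


@dataclass(frozen=True)
class Endpoint:
    rank: int
    vm_id: str
    host_id: str  # hypothetical identifier of the physical node hosting the VM


def select_channel(src: Endpoint, dst: Endpoint) -> Channel:
    """Pick IVShmem for co-resident VMs and SR-IOV for VMs on different nodes."""
    if src.host_id == dst.host_id:
        return Channel.IVSHMEM
    return Channel.SR_IOV


# Illustrative placement: ranks 0 and 1 share a node, rank 2 is on another node.
r0 = Endpoint(rank=0, vm_id="vm-a", host_id="node-1")
r1 = Endpoint(rank=1, vm_id="vm-b", host_id="node-1")
r2 = Endpoint(rank=2, vm_id="vm-c", host_id="node-2")

print(select_channel(r0, r1).value)
print(select_channel(r0, r2).value)
```

The sketch models only the two channels named in the abstract; how the library detects co-residency and handles processes inside the same VM is not specified here.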
