Can Inter-VM Shmem Benefit MPI Applications on SR-IOV Based Virtualized InfiniBand Clusters?

Single Root I/O Virtualization (SR-IOV) technology has been introduced for high-performance interconnects such as InfiniBand. Recent studies mainly focus on the performance characteristics of high-performance communication middleware (e.g., MPI) and applications on SR-IOV enabled HPC clusters. However, current SR-IOV based MPI applications do not take advantage of locality-aware communication in intra-host, inter-VM environments. Although Inter-VM Shared Memory (IVShmem) has been proven to support efficient locality-aware communication, the performance benefits of IVShmem for MPI libraries in virtualized environments have yet to be explored. In this paper, we present a comprehensive performance evaluation of IVShmem-backed MPI using micro-benchmarks and HPC applications. The performance evaluations show that, through IVShmem, the performance of MPI point-to-point and collective operations can be improved by up to 193% and 91%, respectively. Application performance can be improved by up to 96% compared to SR-IOV. The results further show that IVShmem introduces only minor overhead compared to the native environment.
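
As a point of reference, the sketch below illustrates the kind of point-to-point latency micro-benchmark such an evaluation relies on: two MPI ranks placed in co-located VMs on the same host exchange messages in a ping-pong pattern, so the measured latency reflects either the SR-IOV loopback path or the IVShmem shared-memory path, depending on how the MPI library is configured. This is a minimal sketch under assumed parameters (message size, iteration counts); it is not the benchmark suite used in the paper.

/*
 * Illustrative intra-host, inter-VM ping-pong latency test (assumed
 * parameters).  Run with exactly 2 ranks, one in each co-located VM.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE 4096   /* bytes per message (assumed)   */
#define WARMUP   100    /* untimed warm-up iterations    */
#define ITERS    10000  /* timed ping-pong iterations    */

int main(int argc, char **argv)
{
    int rank, size, i;
    double t_start, t_end;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != 2) {
        if (rank == 0)
            fprintf(stderr, "Run with exactly 2 ranks (one per co-located VM).\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    buf = (char *) malloc(MSG_SIZE);

    /* Warm-up so connection setup does not distort the measurement. */
    for (i = 0; i < WARMUP + ITERS; i++) {
        if (i == WARMUP) {               /* start timing after warm-up */
            MPI_Barrier(MPI_COMM_WORLD);
            t_start = MPI_Wtime();
        }
        if (rank == 0) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t_end = MPI_Wtime();

    if (rank == 0)
        printf("avg one-way latency: %.2f us\n",
               (t_end - t_start) * 1e6 / (2.0 * ITERS));

    free(buf);
    MPI_Finalize();
    return 0;
}

Comparing the reported latency with the two ranks communicating over the SR-IOV virtual function versus over an IVShmem-backed channel exposes the intra-host locality benefit the paper quantifies.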
