Can Inter-VM Shmem Benefit MPI Applications on SR-IOV Based Virtualized InfiniBand Clusters?

Single Root I/O Virtualization (SR-IOV) technology has been introduced for high-performance interconnects such as InfiniBand. Recent studies mainly focus on the performance characteristics of high-performance communication middleware (e.g., MPI) and applications on SR-IOV enabled HPC clusters. However, current SR-IOV based MPI applications do not take advantage of locality-aware communication in intra-host, inter-VM environments. Although Inter-VM Shared Memory (IVShmem) has been proven to support efficient locality-aware communication, the performance benefits of IVShmem for MPI libraries in virtualized environments have yet to be explored. In this paper, we present a comprehensive performance evaluation of IVShmem-backed MPI using micro-benchmarks and HPC applications. The performance evaluations show that, through IVShmem, the performance of MPI point-to-point and collective operations can be improved by up to 193% and 91%, respectively. Application performance can be improved by up to 96% compared to SR-IOV. The results further show that IVShmem introduces only minor overhead compared to the native environment.
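
As a point of reference, the sketch below illustrates the kind of point-to-point latency micro-benchmark such an evaluation relies on: two MPI ranks placed in co-located VMs on the same host exchange messages in a ping-pong pattern, so the measured latency reflects either the SR-IOV loopback path or the IVShmem shared-memory path, depending on how the MPI library is configured. This is a minimal sketch under assumed parameters (message size, iteration counts); it is not the benchmark suite used in the paper.

/*
 * Illustrative intra-host, inter-VM ping-pong latency test (assumed
 * parameters).  Run with exactly 2 ranks, one in each co-located VM.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE 4096   /* bytes per message (assumed)   */
#define WARMUP   100    /* untimed warm-up iterations    */
#define ITERS    10000  /* timed ping-pong iterations    */

int main(int argc, char **argv)
{
    int rank, size, i;
    double t_start, t_end;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != 2) {
        if (rank == 0)
            fprintf(stderr, "Run with exactly 2 ranks (one per co-located VM).\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    buf = (char *) malloc(MSG_SIZE);

    /* Warm-up so connection setup does not distort the measurement. */
    for (i = 0; i < WARMUP + ITERS; i++) {
        if (i == WARMUP) {               /* start timing after warm-up */
            MPI_Barrier(MPI_COMM_WORLD);
            t_start = MPI_Wtime();
        }
        if (rank == 0) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t_end = MPI_Wtime();

    if (rank == 0)
        printf("avg one-way latency: %.2f us\n",
               (t_end - t_start) * 1e6 / (2.0 * ITERS));

    free(buf);
    MPI_Finalize();
    return 0;
}

Comparing the reported latency with the two ranks communicating over the SR-IOV virtual function versus over an IVShmem-backed channel exposes the intra-host locality benefit the paper quantifies.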
