Performance implications of virtualizing multicore cluster machines

High-performance computers are typified by cluster machines built from multicore nodes and connected by high-performance interconnects such as InfiniBand. Virtualizing such 'capacity computing' platforms implies sharing not only the nodes and their cores but also the cluster interconnect. This paper presents a detailed study of the implications of sharing these resources, using the Xen hypervisor to virtualize platform nodes and exploiting InfiniBand's native hardware support for simultaneous use by multiple virtual machines (VMs). Measurements are conducted with multiple VMs deployed per node, using modern hypervisor-bypass techniques for high-performance network access, and they evaluate the implications of resource sharing under different patterns of application behavior. Results indicate that multiple applications can share the cluster's multicore nodes without undue effects on the performance of InfiniBand access and use. Higher degrees of sharing are possible with communication-conscious VM placement and scheduling.
