A case for dual stack virtualization: consolidating HPC and commodity applications in the cloud

With the growth of Infrastructure as a Service (IaaS) cloud providers, many have begun to seriously consider cloud services as a substrate for HPC applications. While the cloud promises many benefits for the HPC community, it currently comes with drawbacks for application performance. These performance issues are generally the result of resource contention as multiple VMs compete for the same hardware. This contention results in cross-VM interference, whereby one VM is able to impact the performance of another. For HPC applications, this interference can have a dramatic impact on scalability and performance. To fully support HPC applications in the cloud, services must be available that prevent cross-VM interference and isolate HPC workloads from other users. To achieve this goal, we propose a dual-stack approach to IaaS cloud services that uses multiple concurrent VMMs on each node that are capable of partitioning local resources in order to provide performance isolation. Each partition can then be managed by a specialized VMM designed specifically for either an HPC or a commodity environment. In this paper we demonstrate the use of the Palacios VMM, a virtual machine monitor specifically designed for HPC, in concert with KVM to provide a partitioned cloud platform capable of hosting both commodity and HPC applications on a single node without interference. Our results demonstrate that running KVM and Palacios in parallel allows an HPC application to achieve isolated and scalable performance while sharing hardware resources with commodity VMs.
