Lightweight Memory Management for High Performance Applications in Consolidated Environments

Linux-based operating systems and runtimes (OS/Rs) have emerged as the environments of choice for the majority of HPC systems. While Linux-based OS/Rs offer advantages such as extensive feature sets and developer familiarity, these features come at the cost of additional system overhead. In contrast, there is a substantial history of work in the HPC community focused on lightweight kernel (LWK) OS/Rs that provide scalable and consistent performance for HPC applications but lack many of the features offered by commodity OS/Rs. In this paper, we propose to bridge the gap between LWKs and commodity OS/Rs by selectively providing a lightweight memory subsystem for HPC applications in a commodity OS/R, where concurrently executing a diverse range of workloads is commonplace. Our system, HPMMAP, provides lightweight memory performance transparently to HPC applications by bypassing Linux's memory management layer. Using HPMMAP, HPC applications achieve consistent performance even while the same compute nodes execute competing workloads of the kind likely to be found in HPC clusters and “in-situ” workload deployments. Our approach is dynamically configurable at runtime and requires no resources when not in use. We show that HPMMAP can decrease variance and reduce application runtime by up to 50 percent when a competing commodity workload is co-located on the same node.
