Linux vs. lightweight multi-kernels for high performance computing: experiences at pre-exascale

The long-standing consensus in the High-Performance Computing (HPC) Operating Systems (OS) community is that lightweight kernel (LWK) based OSes have the potential to outperform Linux at extreme scale. To explore whether LWKs live up to this expectation, we developed IHK/McKernel, a lightweight multi-kernel OS designed for HPC, and deployed it on two high-end supercomputers to compare its performance against Linux. Oakforest-PACS, an Intel Xeon Phi (x86) based supercomputer, runs a moderately tuned Linux distribution. Fugaku, the world's fastest supercomputer at the time of writing, is based on Fujitsu's A64FX (aarch64) CPU and runs a highly tuned Linux environment. We discuss recent developments in our OS and describe in detail the challenges of tuning Fugaku's Linux for high-end HPC. In the moderately tuned environment, McKernel significantly outperforms Linux (by up to approximately 2x). On Fugaku, we observe an average speedup of 4% across all our experiments, with a few exceptions where the LWK outperforms Linux by up to 29%. As part of our evaluation, we also disclose a full-scale (158,976 compute nodes) noise profile of the Fugaku system.
