uMMAP-IO: User-Level Memory-Mapped I/O for HPC

The integration of local storage technologies alongside traditional parallel file systems on HPC clusters is expected to raise the programming complexity of scientific applications aiming to take advantage of the increased level of heterogeneity. In this work, we present uMMAP-IO, a user-level memory-mapped I/O implementation that simplifies data management on multi-tier storage subsystems. Compared to the memory-mapped I/O mechanism of the OS, our approach features per-allocation configurable settings (e.g., segment size) and transparently enables access to a diverse range of memory and storage technologies, such as burst buffer I/O accelerators. Preliminary results indicate that uMMAP-IO provides at least 5-10x better performance on representative workloads than the standard memory-mapped I/O of the OS, with only approximately 20-50% degradation on average relative to conventional memory allocations without storage support, at up to 8192 processes.
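For context, the OS mechanism that uMMAP-IO is compared against is the standard POSIX memory-mapped I/O path, sketched minimally below. The uMMAP-IO API itself is not reproduced here; the closing comment about per-allocation segment sizes and storage tiers paraphrases the abstract's description rather than naming concrete library functions.

```c
/* Baseline sketch: standard OS memory-mapped I/O over a file, i.e. the
 * mechanism the abstract benchmarks against. POSIX calls only.
 * A uMMAP-IO allocation would replace the mmap()/msync() pair with a
 * user-level equivalent whose segment size and backing storage tier
 * (e.g., a burst buffer) are configurable per allocation; the concrete
 * interface names are library-specific and not shown here. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const size_t length = 1 << 20;              /* 1 MiB mapping */
    int fd = open("/tmp/mmap_baseline.dat", O_CREAT | O_RDWR, 0600);
    if (fd < 0 || ftruncate(fd, (off_t)length) != 0) {
        perror("setup");
        return 1;
    }

    /* Map the file into the address space; loads/stores become I/O. */
    char *buf = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    memset(buf, 0xAB, length);                  /* write through the mapping */
    msync(buf, length, MS_SYNC);                /* flush dirty pages to storage */

    munmap(buf, length);
    close(fd);
    return 0;
}
```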
