Paper Information - ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems


Scientific applications at exascale generate and analyze massive amounts of data. A critical requirement of these applications is the capability to access and manage this data efficiently on exascale systems. Parallel I/O, the key technology that moves data between compute nodes and storage, faces monumental challenges from new applications and from the memory and storage architectures considered in exascale system designs. As the storage hierarchy expands to include node-local persistent memory and burst buffers alongside disk-based storage, data movement among these layers must be efficient. Parallel I/O libraries of the future should be capable of handling file sizes of many terabytes and beyond. In this paper, we describe new capabilities we have developed in Hierarchical Data Format version 5 (HDF5), the most popular parallel I/O library for scientific applications and one of the most used libraries at the leadership computing facilities for parallel I/O on existing HPC systems. The state-of-the-art features we describe include the Virtual Object Layer (VOL), Data Elevator, asynchronous I/O, full-featured single-writer/multiple-reader (Full SWMR), and parallel querying. We introduce these features, their implementations, and the performance and feature benefits they bring to applications and other libraries.
