LIOProf: Exposing Lustre File System Behavior for I/O Middleware

As the parallel I/O subsystem of large-scale supercomputers grows more complex, with multiple levels of software libraries, hardware layers, and diverse I/O patterns, detecting performance bottlenecks has become a critical requirement. While a few tools exist to characterize application I/O, robust analysis of file system behavior and the association of file-system feedback with application I/O patterns are largely missing. Toward filling this void, we introduce the Lustre I/O Profiler, LIOProf, for monitoring I/O behavior and characterizing I/O activity statistics in the Lustre file system. In this paper, we use LIOProf to uncover pitfalls in MPI-IO's collective read operation over the Lustre file system and to identify HDF5 overhead. Based on LIOProf's characterization, we have implemented a Lustre-specific MPI-IO collective read algorithm, enabled HDF5 collective metadata operations, and applied HDF5 dataset optimizations. Our evaluation on two Cray systems (Cori at NERSC and Blue Waters at NCSA) demonstrates the efficiency of our optimization efforts.
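At the application level, the kinds of optimizations summarized above can be approximated through standard MPI-IO hints and HDF5 property-list calls. The sketch below is illustrative rather than the paper's implementation: it assumes an MPI-enabled HDF5 build (1.10 or later), and the hint values, aggregator count, and file name are hypothetical placeholders to be tuned for the target Lustre configuration.

/* Minimal sketch: steering MPI-IO collective reads over Lustre via ROMIO
 * hints and enabling HDF5 collective metadata operations. Hint values and
 * the file name are illustrative, not the paper's settings. */
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* ROMIO hints that influence the collective read path. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_cb_read", "enable");    /* force collective buffering on reads */
    MPI_Info_set(info, "cb_nodes", "16");             /* number of aggregator nodes (illustrative) */
    MPI_Info_set(info, "cb_buffer_size", "1048576");  /* collective buffer sized to the Lustre stripe (illustrative) */

    /* HDF5 file access property list: MPI-IO driver plus collective metadata. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);
    H5Pset_all_coll_metadata_ops(fapl, 1);  /* collective metadata reads (HDF5 >= 1.10) */
    H5Pset_coll_metadata_write(fapl, 1);    /* collective metadata writes */

    hid_t file = H5Fopen("output.h5", H5F_ACC_RDONLY, fapl); /* hypothetical file */
    /* ... open datasets and read them with H5Dread using a collective transfer property list ... */
    if (file >= 0) H5Fclose(file);

    H5Pclose(fapl);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}

A tool such as LIOProf would then be used to confirm, from server-side statistics, whether these settings actually change the request sizes and aggregation behavior seen by the Lustre object storage targets.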
