Output Performance Study on a Production Petascale Filesystem

This paper reports our observations from Titan, a top-tier supercomputer, and its Lustre parallel file stores under production load. In summary, we find that supercomputer file system performance is highly variable across the machine at fine time scales. This variability has two major implications. First, stragglers lessen the benefit of coupled I/O parallelism (striping). Peak median output bandwidths are obtained with parallel writes to many independent files, with no striping or write-sharing of files across clients (compute nodes). I/O parallelism is most effective when the application, or its I/O middleware system, distributes the I/O load in a balanced way, so that each client writes separate files on multiple targets and each target stores files for multiple clients. Second, our results suggest that the potential benefit of dynamic adaptation is limited. In particular, it is not fruitful to attempt to identify “good spots” in the machine or in the file system: component performance is driven by transient load conditions, and past performance is not a useful predictor of future performance. For example, we do not observe regular diurnal load patterns.
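
The balanced distribution described above can be illustrated with a minimal sketch. It is not the placement method used in the study: the function names, the rotated round-robin rule, and the example sizes (8 clients, 4 targets, 2 files per client) are assumptions chosen only for illustration. Each file lives on exactly one target (no striping) and belongs to exactly one client (no write-sharing).

```python
from collections import Counter, defaultdict


def balanced_placement(num_clients, num_targets, files_per_client):
    """Map (client, file_index) -> target index.

    Hypothetical rule: number all files globally and assign them to
    targets round-robin. Per-target file counts then differ by at most
    one, and each client's files land on distinct targets whenever
    files_per_client <= num_targets.
    """
    placement = {}
    for client in range(num_clients):
        for k in range(files_per_client):
            global_index = client * files_per_client + k
            placement[(client, k)] = global_index % num_targets
    return placement


def target_load(placement):
    """Count how many files each target stores."""
    return Counter(placement.values())


def clients_per_target(placement):
    """Collect the set of clients that write to each target."""
    clients = defaultdict(set)
    for (client, _), target in placement.items():
        clients[target].add(client)
    return dict(clients)


if __name__ == "__main__":
    # Illustrative sizes only; not taken from the paper.
    placement = balanced_placement(num_clients=8, num_targets=4, files_per_client=2)
    print("files per target:", dict(target_load(placement)))
    print("clients per target:", clients_per_target(placement))
```

On a Lustre system, a mapping like this could in principle be applied by creating each file with a stripe count of one on a chosen target (for example with `lfs setstripe -c 1 -i <index>`); whether a static placement of this kind helps on a given machine is a matter for measurement, since the abstract reports that load conditions are transient.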
