Topology-aware data movement and staging for I/O acceleration on Blue Gene/P supercomputing systems

There is growing concern that I/O systems will be hard pressed to satisfy the requirements of future leadership-class machines; even current machines are I/O bound for some applications. In this paper, we identify existing performance bottlenecks in data movement for I/O on the IBM Blue Gene/P (BG/P) supercomputer currently deployed at several leadership computing facilities. We improve I/O performance by exploiting the network topology of BG/P for collective I/O, leveraging the data semantics of applications, and incorporating asynchronous data staging. We demonstrate the efficacy of our approaches with synthetic benchmark experiments and with application-level benchmarks at scale on leadership computing systems.
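The paper's own mechanisms are not reproduced here, but the core collective-I/O idea, grouping compute ranks by their position in the machine topology and funneling each group's data through one aggregator, can be illustrated with standard MPI. The sketch below assumes a fixed number of compute nodes per I/O node (the placeholder `PSET_SIZE`) and a uniform per-rank buffer (`BLOCK`); both constants and the file name `out.dat` are illustrative, not BG/P API values.

```c
/* Minimal sketch of topology-aware two-phase collective I/O,
 * assuming PSET_SIZE consecutive ranks share one I/O node.
 * Rank 0 of each group acts as aggregator: it gathers the
 * group's buffers and issues a single large MPI-IO write. */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

#define PSET_SIZE 64          /* assumed compute nodes per I/O node */
#define BLOCK     (1 << 20)   /* 1 MiB per rank, for illustration   */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Group ranks by topology: ranks that share an I/O node
     * (here approximated by rank / PSET_SIZE) get one color. */
    MPI_Comm pset;
    MPI_Comm_split(MPI_COMM_WORLD, rank / PSET_SIZE, rank, &pset);
    int prank, psize;
    MPI_Comm_rank(pset, &prank);
    MPI_Comm_size(pset, &psize);

    char *mine = malloc(BLOCK);
    memset(mine, rank & 0xff, BLOCK);

    /* Aggregation phase: the group leader collects the group's data. */
    char *agg = (prank == 0) ? malloc((size_t)psize * BLOCK) : NULL;
    MPI_Gather(mine, BLOCK, MPI_BYTE, agg, BLOCK, MPI_BYTE, 0, pset);

    /* I/O phase: only aggregators touch the file system, and each
     * writes one contiguous region instead of psize small requests. */
    if (prank == 0) {
        MPI_File fh;
        MPI_File_open(MPI_COMM_SELF, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        MPI_Offset off =
            (MPI_Offset)(rank / PSET_SIZE) * PSET_SIZE * BLOCK;
        MPI_File_write_at(fh, off, agg, psize * BLOCK, MPI_BYTE,
                          MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
        free(agg);
    }

    free(mine);
    MPI_Comm_free(&pset);
    MPI_Finalize();
    return 0;
}
```

In the paper's setting the grouping would follow the actual BG/P partition layout rather than a rank-arithmetic approximation, and the aggregator's write could be handed to a staging thread so computation proceeds while data drains asynchronously.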
