Route-aware independent MPI I/O on the Blue Gene/Q

Scalable, high-performance I/O is crucial for application performance on large-scale systems. With the growing complexity of system interconnects, it has become important to consider the impact of network contention on I/O performance, because I/O messages traverse several hops in the interconnect before reaching the I/O nodes or the file system. In this work, we present a route-aware and load-aware algorithm that modifies the existing bridge-node assignment on the Blue Gene/Q (BG/Q) supercomputer. Our approach reduces network contention and lowers write time by an average of 60% over the default independent I/O and by 20% over collective I/O on up to 8,192 nodes of the Mira BG/Q system. The algorithm routes 1.4x fewer messages through the bridge nodes, which connect the compute nodes to the I/O nodes on the BG/Q.
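The bridge-node reassignment described above can be pictured as a greedy mapping from compute nodes to nearby, lightly loaded bridge nodes. The C sketch below illustrates that idea only; it is not the paper's implementation, and the Node structure, the hop_distance() metric, and the tie-breaking rule are illustrative assumptions.

    /* Minimal sketch (assumed, not the authors' code) of a route- and
     * load-aware bridge-node assignment: each compute node is mapped to
     * the candidate bridge node with the smallest hop distance, and ties
     * are broken in favor of the bridge node with the fewest compute
     * nodes already assigned to it. */
    #include <limits.h>
    #include <stdlib.h>

    typedef struct { int coords[5]; } Node;   /* BG/Q 5-D torus coordinates */

    /* Hypothetical hop count: Manhattan distance over the five torus
     * dimensions, ignoring wrap-around links for simplicity. */
    static int hop_distance(const Node *a, const Node *b) {
        int d = 0;
        for (int i = 0; i < 5; i++) {
            int diff = a->coords[i] - b->coords[i];
            d += diff < 0 ? -diff : diff;
        }
        return d;
    }

    /* assign[i] receives the index of the bridge node chosen for compute node i. */
    void assign_bridge_nodes(const Node *compute, int n_compute,
                             const Node *bridge, int n_bridge, int *assign) {
        int *load = calloc(n_bridge, sizeof(int));  /* compute nodes per bridge node */
        for (int i = 0; i < n_compute; i++) {
            int best = -1, best_hops = INT_MAX;
            for (int b = 0; b < n_bridge; b++) {
                int h = hop_distance(&compute[i], &bridge[b]);
                if (h < best_hops || (h == best_hops && load[b] < load[best])) {
                    best = b;
                    best_hops = h;
                }
            }
            assign[i] = best;
            load[best]++;
        }
        free(load);
    }

A production assignment would also have to account for torus wrap-around routes and per-I/O-node bandwidth limits, which this sketch omits.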
