Toward message passing for a million processes: characterizing MPI on a massive scale Blue Gene/P

Abstract: Upcoming exascale-capable systems are expected to comprise more than a million processing elements. As researchers continue to work toward architecting these systems, it is becoming increasingly clear that they will rely on a significant amount of hardware shared between processing units, including shared caches, memory, and network components. Understanding how effectively current message-passing and communication infrastructure ties these processing elements together is therefore critical to making educated projections about such future machines. In this paper, we characterize the communication performance of the Message Passing Interface (MPI) implementation on 32 racks (131,072 cores) of the largest Blue Gene/P (BG/P) system in the United States (80% of the total system size) and reveal several insights into its behavior at scale.
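The paper's full benchmark suite is not described here, but point-to-point characterization of the kind the abstract mentions typically begins with a ping-pong microbenchmark. The following is a minimal sketch in C of such a measurement; the message size, iteration count, and timing scheme are illustrative assumptions, not the authors' actual methodology.

```c
/* Minimal MPI ping-pong sketch for measuring point-to-point latency.
 * Illustrative only: msg_size and iters are arbitrary choices, and a
 * real characterization would sweep message sizes and rank placements. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;
    const int iters = 1000;
    const int msg_size = 1024;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0)
            fprintf(stderr, "run with at least 2 ranks\n");
        MPI_Finalize();
        return 1;
    }

    char *buf = malloc(msg_size);
    double start = MPI_Wtime();

    /* Ranks 0 and 1 bounce a fixed-size message back and forth;
     * all other ranks simply wait in MPI_Finalize. */
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, msg_size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, msg_size, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, msg_size, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, msg_size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    double elapsed = MPI_Wtime() - start;
    if (rank == 0)
        printf("avg one-way latency: %.2f us\n",
               elapsed * 1e6 / (2.0 * iters));

    free(buf);
    MPI_Finalize();
    return 0;
}
```

Run with, e.g., `mpiexec -n 2 ./pingpong`. Dividing the round-trip time by two gives the conventional one-way latency estimate.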
