MPI on BlueGene/L: Designing an Efficient General Purpose Messaging Solution for a Large Cellular System

The BlueGene/L computer uses system-on-a-chip integration and a highly scalable 65,536-node cellular architecture to deliver 360 Tflops of peak computing power. Efficient operation of the machine requires a fast, scalable, and standards-compliant MPI library. In this paper, we discuss our efforts to port the MPICH2 library to BlueGene/L.
