Blue Gene/L programming and operating environment

With up to 65,536 compute nodes and a peak performance of more than 360 teraflops, the Blue Gene®/L (BG/L) supercomputer represents a new level of massively parallel systems. The system software stack for BG/L creates a programming and operating environment that harnesses the raw power of this architecture with great effectiveness. The design and implementation of this environment followed three major principles: simplicity, performance, and familiarity. By specializing the services provided by each component of the system architecture, we were able to keep each one simple and leverage the BG/L hardware features to deliver high performance to applications. We also implemented standard programming interfaces and programming languages that greatly simplified the job of porting applications to BG/L. The effectiveness of our approach has been demonstrated by the operational success of several prototype and production machines, which have already been scaled to 16,384 nodes.
