Catamount Software Architecture with Dual Core Extensions

Catamount is the lightweight kernel operating system that runs on the compute nodes of Cray XT3 systems. It is designed to be a low-overhead operating system for a parallel computing environment, with functionality limited to the minimum set needed to run a scientific computation. This paper presents the design choices and their implementation. It reprises the CUG 2005 paper and adds a discussion of how dual-core support was added to the software in the fall and winter of 2005.
