An Atmospheric General Circulation Model with Chemistry for the CRAY T3E: Design, Performance Optimization and Coupling to an Ocean Model

The design, implementation and performance optimization on the CRAY T3E of an atmospheric general circulation model (AGCM) which includes the transport of, and chemical reactions among, an arbitrary number of constituents is reviewed. The parallel implementation is based on a two-dimensional (longitude and latitude) data domain decomposition. Initial optimization efforts centered on minimizing the impact of substantial static and weakly-dynamic load imbalances among processors through load redistribution schemes. Recent optimization efforts have centered on single-node optimization. Strategies employed include loop unrolling, both manually and through the compiler, the use of an optimized assembler-code library for special function calls, and restructuring of parts of the code to improve data locality. Data exchanges and synchronizations involved in coupling different data-distributed models can account for a significant fraction of the running time. Therefore, the required scattering and gathering of data must be optimized. In systems such as the T3E, there is much more aggregate bandwidth in the total system than in any particular processor. This suggests a distributed design. The design and implementation of a such distributed 'Data Broker' as a means to efficiently couple the components of our climate system model is described.