Inverse Space-Filling Curve Partitioning of a Global Ocean Model

In this paper, we describe how inverse space-filling curve partitioning is used to increase the simulation rate of a global ocean model. Space-filling curve partitioning eliminates the load imbalance in the computational grid that is caused by land points. Improved load balance, combined with code modifications within the conjugate gradient solver, significantly increases the simulation rate of the Parallel Ocean Program (POP) at high resolution. The simulation rate for a high-resolution model nearly doubled, from 4.0 to 7.9 simulated years per day, on 28,972 IBM Blue Gene/L processors. We also demonstrate that our techniques increase the simulation rate on 7,545 Cray XT3 processors from 6.3 to 8.1 simulated years per day. Our results demonstrate how minor code modifications can have a significant impact on performance at very large processor counts.
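To illustrate the general idea, the following is a minimal sketch of space-filling curve partitioning on a small block grid. It is not the implementation used in the model. It assumes a square grid of blocks whose side is a power of two, uses a Hilbert curve as the ordering (the choice of curve is an assumption here), and treats each block as either all ocean or all land. The names `hilbert_index`, `partition_ocean_blocks`, and `ocean_mask` are hypothetical. Land blocks are dropped before the curve is cut into contiguous segments, so every process receives nearly the same number of ocean blocks.

```python
def hilbert_index(n, x, y):
    """Map block coordinates (x, y) on an n-by-n grid (n a power of two)
    to the block's position along a Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def partition_ocean_blocks(ocean_mask, nprocs):
    """Assign ocean blocks to processes in contiguous curve segments.

    ocean_mask[y][x] is True for an ocean block and False for a land block.
    Returns a dict mapping (x, y) to the owning process rank.
    """
    n = len(ocean_mask)
    # Keep only ocean blocks; land blocks carry no work and are discarded.
    ocean = [(x, y) for y in range(n) for x in range(n) if ocean_mask[y][x]]
    ocean.sort(key=lambda p: hilbert_index(n, p[0], p[1]))

    # Cut the ordered list into nprocs segments whose sizes differ by at most 1.
    base, extra = divmod(len(ocean), nprocs)
    owner = {}
    start = 0
    for rank in range(nprocs):
        size = base + (1 if rank < extra else 0)
        for block in ocean[start:start + size]:
            owner[block] = rank
        start += size
    return owner


# Hypothetical 4x4 block grid: True = ocean, False = land.
ocean_mask = [
    [True,  True,  False, False],
    [True,  True,  True,  False],
    [False, True,  True,  True],
    [False, False, True,  True],
]
owner = partition_ocean_blocks(ocean_mask, nprocs=3)
```

Because consecutive positions on the curve are neighboring blocks, each segment tends to stay spatially compact, which helps limit communication between processes while the block counts stay balanced.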
