In this paper, the code of the North Atlantic Princeton Ocean Model (NAPOM) used by the Marine Biology Station (MBS) is parallelized and optimized. The FORTRAN source code and the hardware architecture of the MBS cluster are analyzed to characterize the execution behavior of NAPOM and to identify bottlenecks in both software and hardware. Based on this analysis, the most effective optimization and parallelization measures are planned. The most time-consuming modules of the NAPOM package are optimized to achieve maximal performance on the given hardware architecture. The pre-processing modules are distributed across multiple computational nodes, while all independent complex operations are parallelized using shared-memory principles. The resulting parallelized implementation of the NAPOM package executes nearly four times faster than the original, with only minimal additional load on the MBS cluster.
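The shared-memory parallelization of independent operations mentioned above can be illustrated with a minimal sketch. It is written in C with an OpenMP directive rather than in the FORTRAN of the NAPOM sources, and it is not taken from the NAPOM code: the function `depth_average`, its parameters, and the column-major-by-level memory layout are assumptions chosen only for illustration. The sketch averages a three-dimensional field over its vertical levels; because each horizontal grid column is processed independently, the outer loop can be divided among threads without synchronization.

```c
#include <stddef.h>

/* Hypothetical helper, not part of NAPOM: averages a 3-D field over its
 * vertical levels. Assumed layout: the nz levels of one horizontal grid
 * column are stored contiguously, and nz > 0.
 * Every column is independent of all others, so the outer loop is a
 * candidate for shared-memory parallelization. */
void depth_average(const double *field, double *avg, int nx, int ny, int nz)
{
    /* If the compiler is run without OpenMP support, the pragma is ignored
     * and the loop simply executes serially with the same result. */
    #pragma omp parallel for
    for (int col = 0; col < nx * ny; ++col) {
        double sum = 0.0;  /* declared inside the loop, hence private per iteration */
        for (int k = 0; k < nz; ++k) {
            sum += field[(size_t)col * nz + k];
        }
        avg[col] = sum / nz;  /* each iteration writes a distinct element */
    }
}
```

With GCC, the directive is enabled by compiling with `-fopenmp`. No two iterations write to the same element of `avg`, which is the property that makes this kind of loop safe to share among threads; loops with dependencies between iterations would need a different treatment.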