On a dynamic scheduling algorithm for massively parallel computations of atomic isotopes

The goal of this work is to extend the scalability of AtomicClusters, a code for evaluating in-medium properties of nuclear clusters, to extreme scales. To fully exploit the computing power of massively parallel supercomputers, the code was supplemented with parallel output and a dynamic scheduling system based on a task-stealing technique. The scheduling system was implemented as an independent, adjustable module for state-of-the-art distributed-memory high-performance computing (HPC) systems, using the advanced features of the Message Passing Interface (MPI). The parallel output was integrated into the code using MPI-IO. A number of strong-scaling tests were performed on the resulting parallel software. Almost linear scaling was reached on up to 4000 cores. The code scales to at least 38400 processes, but with lower speedup. The results are discussed in the fifth section of the paper.
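
The general idea behind task stealing can be illustrated with a small single-process simulation. The sketch below is not the AtomicClusters scheduler and does not use MPI; it is a minimal model under stated assumptions: tasks are plain numeric costs, the initial distribution is a static block split, and an idle worker picks a random victim and takes one task from the tail of its queue. The function name `simulate` and all parameters are hypothetical.

```python
import random
from collections import deque


def simulate(task_costs, n_workers, steal=True, seed=0):
    """Return the makespan: the time at which the last worker finishes.

    Illustrative model only (assumption): steals are instantaneous and
    an idle worker chooses its victim uniformly at random.
    """
    rng = random.Random(seed)

    # Static block distribution: worker w gets a contiguous slice of tasks.
    queues = [deque() for _ in range(n_workers)]
    for i, cost in enumerate(task_costs):
        queues[i * n_workers // len(task_costs)].append(cost)

    clock = [0.0] * n_workers        # local time of each worker
    active = set(range(n_workers))   # workers that may still get work

    while active:
        # Advance the worker that becomes free first.
        w = min(active, key=lambda r: clock[r])
        if queues[w]:
            # The owner takes work from the head of its own queue.
            clock[w] += queues[w].popleft()
            continue
        victims = [v for v in range(n_workers) if v != w and queues[v]]
        if steal and victims:
            # The thief takes one task from the tail of the victim's
            # queue and executes it immediately.
            v = rng.choice(victims)
            clock[w] += queues[v].pop()
        else:
            # Nothing left to do or to steal: this worker is finished.
            active.discard(w)
    return max(clock)


if __name__ == "__main__":
    # Imbalanced workload: four expensive tasks land on the first worker.
    costs = [8] * 4 + [1] * 12
    print(simulate(costs, 4, steal=False))  # 32.0
    print(simulate(costs, 4, steal=True))   # 12.0
```

In this toy workload the static split leaves one worker with all the expensive tasks, so the makespan is 32 time units; with stealing, the idle workers take over three of them and the makespan drops to 12, close to the ideal 11 (44 units of work on 4 workers). On a distributed-memory machine the queues would live in different processes and a steal would require communication, which this simulation ignores.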
