Exploring Multi-level Parallelism in Atmospheric Applications

The precision of climate model forecasts is limited by the computing power and the time available for their execution. With more and faster processors, the resolution of the mesh used to represent the Earth's atmosphere can be increased, and the numerical forecasts consequently become more accurate. With the introduction of multi-core processors and GPU boards, computer architectures now offer several layers of parallelism: within a processor, among the processors of a machine, and among machines. To make the best use of such computers, a concurrent application must be distributed across all of these parallel levels. However, no single parallel programming interface abstracts these different levels well. In this context, this work proposes the use of mixed programming interfaces to improve the performance of atmospheric models. Parallel executions of simulations show that using GPUs and multi-core CPUs in distributed systems can considerably reduce the execution time of climatological applications.
