An MPI-CUDA implementation for the compression of DEM

A high-performance terrain data compression method is proposed, based on the discrete wavelet transform (DWT) and parallel run-length coding. Applying such a scheme to realistic terrain data sets, however, demands a great deal of computing power. Graphics processing units (GPUs) programmed with the Compute Unified Device Architecture (CUDA) are rapidly becoming a major choice in high-performance computing, and the number of applications ported to the CUDA platform is growing quickly. The Message Passing Interface (MPI) has been a mainstay of high-performance computing for more than a decade and has proven its ability to deliver high performance in parallel applications. CUDA and MPI use different programming approaches, but both depend on the inherent parallelism of the application to be effective. In this approach, MPI serves as the data distribution mechanism between the GPU nodes, and CUDA serves as the main computing engine. This allows the programmer to connect GPU nodes via high-speed Ethernet without special technologies. We accelerate the compression of digital elevation models (DEMs) by exploiting the combined power of several CUDA-enabled GPUs in a GPU cluster. The implementation overlaps MPI communication with CPU-GPU memory transfers and GPU computation to increase efficiency. Several numerical experiments, performed on a cluster of modern CUDA-enabled GPUs, demonstrate the efficiency of the distributed implementation. The speed-up exceeded 20 relative to a two-thread CPU version.
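To make the two compression stages concrete, the following is a minimal sequential sketch of a wavelet step followed by run-length coding on one row of elevation samples. It rests on several assumptions that the abstract does not confirm: the wavelet is taken to be a one-level integer Haar transform (the S-transform), no quantization is applied, and the run-length coder is serial rather than parallel. All function names are hypothetical and chosen for illustration only; the sketch does not reproduce the MPI-CUDA implementation.

```python
from typing import List, Tuple


def haar_forward(row: List[int]) -> List[int]:
    """One level of the integer Haar (S) transform on an even-length row.

    Returns the floor averages followed by the pairwise differences.
    """
    half = len(row) // 2
    averages = [(row[2 * i] + row[2 * i + 1]) // 2 for i in range(half)]
    details = [row[2 * i] - row[2 * i + 1] for i in range(half)]
    return averages + details


def haar_inverse(coeffs: List[int]) -> List[int]:
    """Exactly invert haar_forward using integer arithmetic only."""
    half = len(coeffs) // 2
    row: List[int] = []
    for s, d in zip(coeffs[:half], coeffs[half:]):
        a = s + (d + 1) // 2  # first sample of the pair
        b = a - d             # second sample of the pair
        row.extend([a, b])
    return row


def run_length_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive equal values into (value, count) pairs."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def run_length_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, count) pairs back into the original sequence."""
    out: List[int] = []
    for v, count in runs:
        out.extend([v] * count)
    return out


# Illustrative data only: a short, nearly flat strip of elevations.
dem_row = [120, 120, 120, 120, 121, 121, 121, 121]
coeffs = haar_forward(dem_row)
encoded = run_length_encode(coeffs)
restored = haar_inverse(run_length_decode(encoded))
```

On smooth terrain the detail coefficients are mostly zero, so they collapse into a few long runs; that is the property the run-length stage relies on. In a cluster setting, each row or tile could be processed independently, which is the kind of inherent parallelism the abstract refers to.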
