A MapReduce Computing Framework Based on GPU Cluster

In recent years, the GPU has become a power-efficient device for high-performance computing and is widely used in highly parallel applications. Its hierarchy of threads and memory has proven successful for large-scale multithreaded applications. However, programming GPUs efficiently enough to fully utilize their computing power remains a major obstacle for potential users. We designed and implemented a new parallel GPU programming framework based on MapReduce. The framework uses a distributed file system (GlusterFS) to store data across the cluster. Its aim is to improve the efficiency, transparency, and scalability of high-performance computing on GPU clusters, and dynamic load balancing received particular attention in the design. We demonstrate how typical tasks in the oil industry are modified to fit the framework. Prestack Kirchhoff time migration (PKTM) of seismic data was tested and achieved good acceleration performance.
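The combination of a MapReduce programming model with dynamic load balancing can be illustrated with a minimal sketch. It is not the framework described here: CPU threads stand in for GPUs, word counting stands in for a real workload such as PKTM, and every name (`map_chunk`, `reduce_pairs`, `run_mapreduce`) is hypothetical. The sketch only shows the general idea that idle workers pull the next chunk from a shared queue instead of receiving a fixed share up front.

```python
import queue
import threading
from collections import defaultdict


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for each word in the chunk."""
    return [(word, 1) for word in chunk.split()]


def reduce_pairs(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_mapreduce(chunks, num_workers=2):
    """Run map tasks with dynamic load balancing, then reduce the results."""
    tasks = queue.Queue()
    for chunk in chunks:
        tasks.put(chunk)

    emitted = []                   # all intermediate (key, value) pairs
    processed = [0] * num_workers  # number of chunks each worker handled
    lock = threading.Lock()

    def worker(worker_id):
        # Each worker (a stand-in for one GPU) takes the next chunk as soon
        # as it is free, so faster workers end up handling more chunks.
        while True:
            try:
                chunk = tasks.get_nowait()
            except queue.Empty:
                return
            pairs = map_chunk(chunk)
            with lock:
                emitted.extend(pairs)
                processed[worker_id] += 1

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    return reduce_pairs(emitted), processed
```

Calling `run_mapreduce(["a b a", "b c", "a"], num_workers=3)` returns the per-word totals together with a per-worker count of processed chunks. The totals are the same on every run, while the split of chunks among workers depends on scheduling, which is the point of pulling work dynamically.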
