A Research of MapReduce with GPU Acceleration

MapReduce is an efficient distributed computing model for large data sets. Data processing is fully distributed across a large number of nodes, and a MapReduce cluster is highly scalable. However, single-node performance is gradually becoming a bottleneck in compute-intensive jobs, which makes it difficult to extend the MapReduce model to wider application fields such as large-scale image processing and image mining. This paper presents an approach to GPU-accelerated MapReduce, implemented with Hadoop and OpenCL. Its distinctive feature is that it targets general-purpose, inexpensive hardware platforms and integrates seamlessly with Apache Hadoop, the most widely used MapReduce framework. As a heterogeneous multi-machine, many-core architecture, it targets both data- and compute-intensive applications. A performance improvement of almost 2 times has been validated without any special optimization.
