Paper Information - Exploiting Computational Resources in Distributed Heterogeneous Platforms

Exploiting Computational Resources in Distributed Heterogeneous Platforms

We have been witnessing continuous growth in both heterogeneous computational platforms (e.g., Cell blades, or the joint use of traditional CPUs and GPUs) and multi-core processor architectures, and it is still an open question how applications can exploit this computational potential efficiently. In this paper we introduce a run-time environment and programming framework that supports the implementation of scalable and efficient parallel applications in such heterogeneous, distributed environments. We assess these issues through well-known kernels and actual applications, with both regular and irregular behavior, which are not only relevant but also demanding in terms of computation and I/O. Moreover, the irregularity of these, as well as of many other applications, poses a challenge to the design and implementation of efficient parallel algorithms. Our experimental environment includes dual- and octa-core machines augmented with GPUs, and we evaluate our framework's performance for standalone and distributed executions. The evaluation in a distributed environment showed near-linear scale-ups for two data mining applications, while application performance when using both CPU and GPU improved by around 25% compared to the GPU-only versions.

Dorgival O. Guedes | Rafael Sachetto Oliveira | George Teodoro | Renato Ferreira | Daniel Fireman

[1] Yoonho Park, et al. SPC: a distributed, scalable platform for data mining, 2006, DMSSP '06.

[2] Yuan Yu, et al. Dryad: distributed data-parallel programs from sequential building blocks, 2007, EuroSys '07.

[3] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[4] Srinivasan Parthasarathy, et al. New Algorithms for Fast Discovery of Association Rules, 1997, KDD.

[5] Naga K. Govindaraju, et al. Mars: A MapReduce Framework on graphics processors, 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[6] Bingsheng He, et al. Mars: Accelerating MapReduce with Graphics Processors, 2011, IEEE Transactions on Parallel and Distributed Systems.

[7] J. Kulpa, et al. Time-frequency analysis using NVIDIA compute unified device architecture (CUDA), 2009, Symposium on Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments (WILGA).

[8] Lúcia Maria de A. Drummond, et al. Anthill: a scalable run-time environment for data mining applications, 2005, 17th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'05).

[9] J. L. Hodges, et al. Discriminatory Analysis - Nonparametric Discrimination: Consistency Properties, 1989.

[10] Wagner Meira, et al. Achieving Multi-Level Parallelism in the Filter-Labeled Stream Programming Model, 2008, 2008 37th International Conference on Parallel Processing.

[11] Teresa H. Y. Meng, et al. Merge: a programming model for heterogeneous multi-core systems, 2008, ASPLOS.