Paper information - Reengineering for parallelism in heterogeneous parallel platforms

Parallel programming models have evolved dramatically in recent years. Historically, research focused on exploiting multi-core and many-core processors, with the expectation of an increasing number of cores per chip, but the emergence of increasingly heterogeneous computing has changed the landscape. We have moved from a scenario dominated mostly by OpenMP [1] at the node level and by MPI [2] at the cluster level towards one where GPUs and other accelerators are starting to have a pervasive presence in the target parallel platforms. The presence of GPUs in the TOP500 supercomputer list [3] has also been increasing. In this new scenario, both new and existing applications need to be adapted to deal with diverse and complex hardware environments. The number of legacy applications that need to be ported to multiple heterogeneous architectures makes it necessary to improve the process of transforming existing applications to new programming models. Parallel patterns have been in use since the 1990s [4]. They have emerged as a way of expressing parallelism in existing sequential applications, raising the abstraction level and ensuring a proper separation of concerns between application semantics and technical implementation details. Many algorithms match a parallel pattern approach, and patterns are easily exploitable by heterogeneous parallel architectures [5]. With the emergence of heterogeneous platforms, patterns have been shown to be an excellent way to express algorithms that can then be mapped to multiple architectures, thus reducing the software development effort.

José Daniel García Sánchez | Kevin Hammond | Lutz Schubert

[1] Peter Kilpatrick, et al. A parallel pattern for iterative stencil + reduce, 2016, The Journal of Supercomputing.

[2] Arch D. Robison, et al. Structured Parallel Programming: Patterns for Efficient Computation, 2012.

[3] Murray Cole, et al. Parallel Skeletons, 2011, Encyclopedia of Parallel Computing.

[4] Marco Danelutto, et al. Data stream processing via code annotations, 2016, The Journal of Supercomputing.

[5] Christina Freytag, et al. Using MPI: Portable Parallel Programming with the Message-Passing Interface, 2016.

[6] William Gropp, et al. Using MPI: Portable Parallel Programming with the Message-Passing Interface, 1994.

[7] Christoph W. Kessler, et al. MeterPU: a generic measurement abstraction API, 2015, 2015 IEEE Trustcom/BigDataSE/ISPA.

[8] Luis Miguel Sánchez, et al. Assessing and discovering parallelism in C++ code for heterogeneous platforms, 2016, The Journal of Supercomputing.

[9] L. Dagum, et al. OpenMP: an industry standard API for shared-memory programming, 1998.

[10] Pavan Balaji, et al. Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models, 2016, The Journal of Supercomputing.