A case study of the task-based parallel wavefront pattern

This paper analyzes the applicability of the task-programming model to the parallelization of the wavefront pattern. Computations of this type are characterized by a data dependency pattern across a data space, which produces a variable number of independent tasks as the space is traversed. Different implementations of this pattern are studied, based on the current state-of-the-art threading frameworks that support tasks. For each implementation, the specific issues are discussed from a programmer’s point of view, highlighting any advantageous features in each case. In addition, several experiments are carried out, and the factors that can limit performance in each implementation are identified. Moreover, some optimizations that the programmer can exploit to reduce overheads (task recycling, prioritization of tasks based on locality hints and tiling) are proposed and
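To make the dependency pattern concrete, the following is a minimal sketch of a task-based wavefront, not the implementation studied in the paper. It assumes the classic two-dimensional case in which cell (i, j) depends on its north and west neighbours, and it uses Python's standard thread pool in place of the threading frameworks the paper evaluates. The names `wavefront`, `run_cell` and `pending` are hypothetical. Each cell keeps a counter of unfinished predecessors; a finishing task decrements the counters of its south and east neighbours and spawns any neighbour whose counter reaches zero.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def wavefront(n, workers=4):
    """Fill an n x n grid where cell (i, j) = cell (i-1, j) + cell (i, j-1).

    Border cells keep the value 1. The dependency shape (north and west
    neighbours) is an assumption chosen for illustration.
    """
    if n < 1:
        return []
    grid = [[1] * n for _ in range(n)]
    # Unfinished predecessors per cell: 0 at the corner, 1 on borders, 2 inside.
    pending = [[(i > 0) + (j > 0) for j in range(n)] for i in range(n)]
    lock = threading.Lock()
    done = threading.Event()
    remaining = [n * n]  # cells that have not finished yet

    def run_cell(i, j):
        # The "work" of the task: both predecessors are guaranteed complete.
        if i > 0 and j > 0:
            grid[i][j] = grid[i - 1][j] + grid[i][j - 1]
        ready = []
        with lock:
            # Notify the south and east neighbours that one dependency is met.
            for ni, nj in ((i + 1, j), (i, j + 1)):
                if ni < n and nj < n:
                    pending[ni][nj] -= 1
                    if pending[ni][nj] == 0:
                        ready.append((ni, nj))
            remaining[0] -= 1
            finished = remaining[0] == 0
        # Spawn one new task per neighbour that just became ready.
        for ni, nj in ready:
            pool.submit(run_cell, ni, nj)
        if finished:
            done.set()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pool.submit(run_cell, 0, 0)  # only the corner cell is ready at the start
        done.wait()                  # keep the pool open until every cell has run
    return grid


if __name__ == "__main__":
    for row in wavefront(5):
        print(row)
```

In this sketch the number of cells that can be ready at once grows and then shrinks along the anti-diagonals: for an n x n grid, diagonal d (with d from 0 to 2n-2) holds min(d+1, 2n-1-d) cells. This is the variable amount of parallelism the abstract refers to. Because of Python's global interpreter lock, the sketch illustrates the scheduling logic only and should not be expected to show a speedup.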
