Programming Heterogeneous Multicore Systems Using Threading Building Blocks

Intel's Threading Building Blocks (TBB) provide a high-level abstraction for expressing parallelism in applications without writing explicitly multi-threaded code. However, TBB is only available for shared-memory, homogeneous multicore processors. Codeplay's Offload C++ provides a single-source, POSIX threads-like approach to programming heterogeneous multicore devices where cores are equipped with private, local memories--code to move data between memory spaces is generated automatically. In this paper, we show that the strengths of TBB and Offload C++ can be combined, by implementing part of the TBB headers in Offload C++. This allows applications parallelised using TBB to run, without source-level modifications, across all the cores of the Cell BE processor. We present experimental results applying our method to a set of TBB programs. To our knowledge, this work marks the first demonstration of programs parallelised using TBB executing on a heterogeneous multicore architecture.

[1]  Bjarne Stroustrup,et al.  The Design and Evolution of C , 1994 .

[2]  Tao Zhang,et al.  Supporting OpenMP on Cell , 2007, IWOMP.

[3]  H. Peter Hofstee,et al.  Power efficient processor architecture and the cell processor , 2005, 11th International Symposium on High-Performance Computer Architecture.

[4]  Andrew Richards,et al.  Offload - Automating Code Migration to Heterogeneous Multicore Systems , 2010, HiPEAC.

[5]  Kai Li,et al.  The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[6]  Kathryn M. O'Brien,et al.  Supporting OpenMP on cell , 2008 .

[7]  Aaftab Munshi,et al.  The OpenCL specification , 2009, 2009 IEEE Hot Chips 21 Symposium (HCS).