Algorithmic Skeletons within an Embedded Domain Specific Language for the CELL Processor

Using the hardware capabilities of the Cell processor efficiently and reusing legacy code are real challenges for application developers. The Cell is a heterogeneous chip multiprocessor that exploits several levels of parallelism to deliver high performance. We propose to use generative programming, and more precisely template meta-programming, to design a Domain Specific Embedded Language that uses algorithmic skeletons to generate applications from a high-level mapping description. The method is easy for developers to use and delivers performance close to that of optimized hand-written code, as shown on benchmarks ranging from simple BLAS kernels to image processing applications.
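To make the idea of skeletons expressed through template meta-programming more concrete, the following is a minimal, host-only C++ sketch. It is not the language described in this paper: the names `Seq`, `Pipe`, `seq`, `operator|` and `run_pipeline_example` are hypothetical, and no Cell-specific code (SPE offloading, DMA transfers) is modelled. It only illustrates how a mapping description written as an ordinary expression can be encoded in a type, so that the compiler resolves the application structure statically.

```cpp
#include <cassert>

// Hypothetical skeleton: wraps a plain sequential function as one stage.
template <class F>
struct Seq {
    F f;
    template <class T>
    auto operator()(T x) const { return f(x); }
};

// Hypothetical skeleton: two stages chained in order. The whole chain is
// encoded in the type Pipe<A, B>, so its structure is known at compile time.
template <class A, class B>
struct Pipe {
    A first;
    B second;
    template <class T>
    auto operator()(T x) const { return second(first(x)); }
};

// Helper that deduces the function type (functions decay to pointers).
template <class F>
Seq<F> seq(F f) { return Seq<F>{f}; }

// The '|' operator only builds nested Pipe types; it performs no computation.
template <class A, class B>
Pipe<Seq<A>, Seq<B>> operator|(Seq<A> a, Seq<B> b) {
    return Pipe<Seq<A>, Seq<B>>{a, b};
}

template <class A, class B, class C>
Pipe<Pipe<A, B>, Seq<C>> operator|(Pipe<A, B> p, Seq<C> c) {
    return Pipe<Pipe<A, B>, Seq<C>>{p, c};
}

// Stand-ins for legacy sequential functions reused unchanged.
int add_one(int x) { return x + 1; }
int twice(int x) { return x * 2; }

// Describes a three-stage pipeline and runs it on one input value.
int run_pipeline_example(int x) {
    auto app = seq(add_one) | seq(twice) | seq(add_one);
    return app(x);
}

// Small self-check: ((3 + 1) * 2) + 1 == 9.
void self_check() { assert(run_pipeline_example(3) == 9); }
```

In this sketch the pipeline runs sequentially on a single core. A generative implementation in the spirit of the abstract would instead inspect the nested `Pipe` type at compile time and emit code that places each stage on a processing element; that step is deliberately left out here.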
