Implementing Fusion-Equipped Parallel Skeletons by Expression Templates

Developing efficient parallel programs is more difficult and complicated than developing sequential ones. Skeletal parallelism is a promising methodology for easy parallel programming, in which users develop parallel programs by composing ready-made components called parallel skeletons. We have developed SkeTo, a parallel skeleton library that provides parallel skeletons implemented in C++ and MPI for distributed-memory environments. In the new version of the library, the implementation of the parallel skeletons for lists has been improved so that the skeletons themselves are equipped with fusion optimization. The optimization mechanism is based on the programming technique called expression templates. In this paper, we describe the improved design and implementation of the parallel skeletons for lists in the SkeTo library.
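To make the idea of fusion through expression templates concrete, the following is a minimal sequential sketch in C++14. It is not the SkeTo API: the names `ListRef`, `MapExpr`, `lazy`, `lazy_map`, `evaluate`, and `fused_reduce` are hypothetical and chosen only for illustration, and the MPI distribution of the list is omitted. The sketch assumes that a map skeleton returns a lightweight expression object instead of a new list, so that a chain of maps followed by a reduce collapses into a single loop with no intermediate lists.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical names throughout; an illustration only, not the SkeTo API.

// Leaf node: refers to an existing vector without copying it.
// The vector must outlive every expression built on top of it.
template <typename T>
struct ListRef {
    const std::vector<T>& data;
    std::size_t size() const { return data.size(); }
    T operator[](std::size_t i) const { return data[i]; }
};

// Interior node: records "apply f to every element of src" without doing it.
// The element is computed only when operator[] is called.
template <typename F, typename E>
struct MapExpr {
    F f;
    E src;
    std::size_t size() const { return src.size(); }
    auto operator[](std::size_t i) const { return f(src[i]); }
};

// Wraps a vector as the leaf of an expression tree.
template <typename T>
ListRef<T> lazy(const std::vector<T>& v) { return ListRef<T>{v}; }

// Map skeleton: builds an expression node instead of a new list.
template <typename F, typename E>
MapExpr<F, E> lazy_map(F f, E src) { return MapExpr<F, E>{f, src}; }

// Forces an expression into a concrete list with one loop.
// Nested maps are composed per element, so no intermediate list is built.
template <typename E>
auto evaluate(const E& e) {
    using T = std::decay_t<decltype(e[0])>;
    std::vector<T> out;
    out.reserve(e.size());
    for (std::size_t i = 0; i < e.size(); ++i) out.push_back(e[i]);
    return out;
}

// Reduce skeleton over an expression: the pending maps are fused into
// the reduction loop.
template <typename Op, typename E, typename T>
T fused_reduce(Op op, const E& e, T init) {
    T acc = init;
    for (std::size_t i = 0; i < e.size(); ++i) acc = op(acc, e[i]);
    return acc;
}

// Usage: computes the sum of (x * x + 1) over xs in a single pass.
inline int sum_of_incremented_squares(const std::vector<int>& xs) {
    auto squares = lazy_map([](int x) { return x * x; }, lazy(xs));
    auto incremented = lazy_map([](int x) { return x + 1; }, squares);
    return fused_reduce([](int a, int b) { return a + b; }, incremented, 0);
}
```

Because the composition is encoded in the type `MapExpr<F, MapExpr<G, ListRef<T>>>`, the compiler can inline the whole chain at the point of evaluation. This is the general mechanism that expression templates provide; how SkeTo applies it to its distributed lists is described in the paper itself.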
