Elements for Parallel Array Computations

Parallel high-performance array programming in functional languages is usually done with a fixed set of carefully chosen high-level combinators. Problems that fall outside the design scope of these combinators usually result in programs that are too slow, so programmers resort to implementing them in low-level imperative languages without safety guarantees. We present element sets, which keep the safety of high-level functional combinators but relax the order in which array elements are specified. The cost of this flexibility is two proof obligations for the compiler or library writer; the return is safety guarantees with zero performance overhead for all programs. We implemented element sets in a compiler and measured that scalability and performance remained competitive with our solution.
