Vectorisation avoidance

Flattening nested parallelism is a vectorising code transformation that converts irregular nested parallelism into flat data parallelism. Although the result has good asymptotic performance, flattening thoroughly restructures the code: it introduces many intermediate data structures and traversals, which subsequent optimisation may or may not eliminate. We present a novel program analysis that identifies parts of the program where flattening would only introduce overhead without a compensating gain. We also present empirical evidence that avoiding vectorisation in these cases yields more efficient programs than applying vectorisation and then relying on array fusion to eliminate intermediates from the resulting code.
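
The following is a minimal sketch of the overhead in question, not the analysis or the implementation described in the paper. It assumes a simple segmented representation (a list of segment lengths plus one flat data list) and a hypothetical scalar computation `x * 2 + 1`; the names `flatten`, `unflatten`, `scalar_body`, `scale_shift_vectorised`, and `scale_shift_avoided` are invented for illustration. The fully vectorised variant lifts each scalar operation to a whole-array operation and so materialises an intermediate array, whereas the variant that avoids vectorisation keeps the scalar code as one unit and traverses the data once.

```python
from typing import List, Tuple

Nested = List[List[float]]
# Assumed flat representation: (segment lengths, concatenated elements).
Flat = Tuple[List[int], List[float]]


def flatten(nested: Nested) -> Flat:
    """Convert an irregular nested array into lengths plus flat data."""
    lengths = [len(segment) for segment in nested]
    data = [x for segment in nested for x in segment]
    return lengths, data


def unflatten(flat: Flat) -> Nested:
    """Rebuild the nested array from its flat representation."""
    lengths, data = flat
    nested, start = [], 0
    for n in lengths:
        nested.append(data[start:start + n])
        start += n
    return nested


def scalar_body(x: float) -> float:
    """Purely scalar code: it contains no parallelism to expose."""
    return x * 2.0 + 1.0


def scale_shift_vectorised(flat: Flat) -> Flat:
    """Every scalar operation is lifted to a whole-array operation."""
    lengths, data = flat
    scaled = [x * 2.0 for x in data]     # intermediate array, first traversal
    shifted = [x + 1.0 for x in scaled]  # second traversal
    return lengths, shifted


def scale_shift_avoided(flat: Flat) -> Flat:
    """The scalar body stays unvectorised and is mapped in one traversal."""
    lengths, data = flat
    return lengths, [scalar_body(x) for x in data]


nested = [[1.0, 2.0], [], [3.0, 4.0, 5.0]]
flat = flatten(nested)
# Both variants compute the same nested result.
assert unflatten(scale_shift_vectorised(flat)) == unflatten(scale_shift_avoided(flat))
```

In this toy setting a fusion pass could merge the two traversals of the vectorised variant after the fact; the sketch only illustrates why leaving purely scalar code unvectorised removes the need for that clean-up in the first place.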
