Functional parallels of sequential imperatives (short paper)

Symbolic parallelism is a fresh look at the decade-old problem of turning sequential, imperative code into associative reduction kernels. It rests on the realization that map/reduce is at its core a staging problem: how can a program be separated so that part of the computation can be done before loop-carried dependencies become available? Previous work has investigated dynamic approaches that build symbolic summaries while the actual data is processed. In this short paper, we approach the problem from the static side and show that simple syntax- or type-driven transformations can readily turn large classes of imperative groupby-aggregate programs into map/reduce parallelism with deterministic overhead.
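
To make the staging view concrete, the following is a minimal hand-written sketch and not the transformation developed in the paper. It assumes a hypothetical groupby-aggregate loop whose per-key update is `state = state * base + x`, so every iteration depends on the previous one. Each chunk is summarized as an affine function `(m, c)` of its unknown incoming state, meaning `out = m * in + c`; this part of the work needs no loop-carried value. The summaries then compose associatively in a reduce step. All function names (`sequential`, `summarize`, `combine`, `parallel`) and the choice of update are illustrative assumptions.

```python
from functools import reduce


def sequential(records, base=10):
    """Reference imperative loop: per-key aggregate with a loop-carried state."""
    state = {}
    for key, x in records:
        # Each update reads the previous state for this key.
        state[key] = state.get(key, 0) * base + x
    return state


def summarize(chunk, base=10):
    """Map phase: per key, build (m, c) such that out = m * in + c.

    The incoming state is left symbolic, so a chunk can be processed
    before the chunks preceding it have finished.
    """
    summary = {}
    for key, x in chunk:
        m, c = summary.get(key, (1, 0))  # (1, 0) is the identity function
        # (m * in + c) * base + x == (m * base) * in + (c * base + x)
        summary[key] = (m * base, c * base + x)
    return summary


def combine(left, right):
    """Reduce phase: compose two summaries, applying `left` first."""
    out = dict(left)
    for key, (m2, c2) in right.items():
        m1, c1 = out.get(key, (1, 0))
        # m2 * (m1 * in + c1) + c2 == (m1 * m2) * in + (m2 * c1 + c2)
        out[key] = (m1 * m2, m2 * c1 + c2)
    return out


def parallel(records, chunks=3, base=10):
    """Split into chunks, summarize each independently, then reduce."""
    size = max(1, -(-len(records) // chunks))  # ceiling division
    parts = [records[i:i + size] for i in range(0, len(records), size)]
    summaries = [summarize(p, base) for p in parts]  # independent per chunk
    total = reduce(combine, summaries, {})
    # Finally supply the real initial state (0) to each composed function.
    return {key: m * 0 + c for key, (m, c) in total.items()}


if __name__ == "__main__":
    data = [("a", 1), ("b", 2), ("a", 3), ("a", 4), ("b", 5)]
    print(sequential(data))
    print(parallel(data, chunks=2))
```

Because `combine` is function composition over affine maps, it is associative, so the reduce step may group chunks in any order-preserving way. The list comprehension over `parts` runs sequentially here; it stands in for the step that a real system would distribute.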
