Yada: Straightforward parallel programming

Now that multicore chips are common, providing an approach to parallel programming that is usable by regular programmers has become even more important. This challenge has a silver lining: obtaining a useful speedup on a program is valuable in and of itself, even if the resulting performance falls short of the best possible parallel performance on the same program. To help achieve this goal, Yada is an explicitly parallel programming language with sequential semantics. It is explicitly parallel because we believe that programmers need to identify how and where to exploit potential parallelism, but it has sequential semantics so that programmers can understand and debug their parallel programs in a way they already know: as if the programs were sequential. The key new idea in Yada is a set of types that support parallel operations while still preserving sequential semantics. Beyond the natural read-sharing found in most previous sequential-like languages, Yada supports three other kinds of sharing. Write-once locations support a single write and multiple reads, while two further kinds of sharing, for locations updated only with an associative operator, generalise the reduction and parallel-prefix operations found in many data-parallel languages. We expect to support other kinds of sharing in the future. We have evaluated our Yada prototype on eight algorithms and four applications, and found that programs require only a few changes to get useful speedups ranging from 2.2× to 6.3× on an 8-core machine. Yada performance is mostly comparable to parallel implementations of the same programs using OpenMP or explicit threads.
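To make the reduction idea concrete, the following is a minimal illustrative sketch (in Python, not Yada syntax; the `ReductionCell` class and `parallel_sum` helper are hypothetical names invented here). It models a location that may only be updated through an associative operator (`+`): because the operator is associative and commutative, the final value matches the sequential result regardless of how the parallel updates interleave, which is what lets such locations preserve sequential semantics.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class ReductionCell:
    """A shared location that only admits associative `+=`-style updates.

    Reads of the combined value are deferred until all updates are done,
    so every execution order yields the sequential result.
    """
    def __init__(self, init=0):
        self._value = init
        self._lock = threading.Lock()

    def add(self, x):
        # The only permitted update: combine with the associative operator.
        with self._lock:
            self._value += x

    def get(self):
        return self._value

def parallel_sum(data, workers=4):
    """Sum `data` by splitting it into chunks reduced in parallel."""
    data = list(data)
    cell = ReductionCell(0)
    chunk = max(1, (len(data) + workers - 1) // workers)
    with ThreadPoolExecutor(max_workers=workers) as ex:
        for i in range(0, len(data), chunk):
            # Each task folds its chunk into the shared reduction cell.
            ex.submit(lambda s=data[i:i + chunk]: cell.add(sum(s)))
    # All tasks have joined when the executor context exits.
    return cell.get()

print(parallel_sum(range(100)))  # same as the sequential sum(range(100))
```

The write-once and parallel-prefix sharing kinds described above would follow the same pattern: the type restricts which operations are allowed on the location, and that restriction is what guarantees a deterministic, sequential-equivalent outcome.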
