Implicitly-threaded parallelism in Manticore

The increasing availability of commodity multicore processors is bringing parallel computing to the masses. Traditional parallel languages are largely intended for large-scale scientific computing and tend not to be well suited to the applications one typically finds on a desktop system, so we need new parallel-language designs that address a broader spectrum of applications. In this paper, we present Manticore, a language for building parallel applications on commodity multicore hardware, including a diverse collection of parallel constructs for different granularities of work. We focus on the implicitly-threaded parallel constructs of our high-level functional language and concentrate on the elements that distinguish our design from related ones: a novel parallel binding form, a nondeterministic parallel case form, and exceptions in the presence of data parallelism. These features set the present work apart from related work on functional data-parallel language designs, which has focused largely on parallel problems with regular structure and on the compiler transformations (most notably, flattening) that make such designs feasible. We describe our implementation strategies and present detailed examples that exercise the various mechanisms of the language.
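
Since the abstract does not show the constructs themselves, the sketch below illustrates, in Manticore-style syntax, how the implicitly-threaded forms mentioned above might be written: a parallel tuple, the parallel binding form, and the nondeterministic parallel case form. The concrete syntax shown here ((| ... |), pval, pcase, and the ? wildcard) is drawn from the published Manticore examples, and the functions f, g, expensive, and search are hypothetical placeholders; consult the paper body for the definitive forms.

    (* Parallel tuple: the two applications may be evaluated in parallel;
       the result is an ordinary pair.  f and g are hypothetical. *)
    val (x, y) = (| f a, g b |)

    (* Parallel binding: the right-hand side is sparked speculatively and
       its result is demanded only where z is used; if control reaches a
       point where z cannot be needed, the computation may be cancelled. *)
    pval z = expensive c

    (* Nondeterministic parallel case: both searches run in parallel, and
       whichever branch's pattern is satisfied first is taken.  The ?
       wildcard matches a subcomputation that has not yet completed, so a
       successful search can be reported without waiting for the other,
       which is then cancelled. *)
    fun find (t1, t2) = (
          pcase search t1 & search t2
           of SOME x & ?    => SOME x
            | ? & SOME y    => SOME y
            | NONE & NONE   => NONE)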
