A provably time-efficient parallel implementation of full speculation

Speculative evaluation, including leniency and futures, is often used to produce high degrees of parallelism. Understanding the performance characteristics of such evaluation, however, requires a detailed understanding of the implementation. For example, the particular implementation technique used to suspend and reactivate threads can have an asymptotic effect on performance. With the goal of giving users some understanding of performance without requiring them to understand the implementation, we present a provable implementation bound for a language based on speculative evaluation. The idea is (1) to supply users with a semantics for the language that defines abstract costs for measuring or analyzing the performance of computations, (2) to supply users with a mapping of these costs onto running times on various machine models, and (3) to describe an implementation strategy for the language and prove that it meets these mappings. For this purpose we consider a simple language based on speculative evaluation. For every computation, the semantics of the language returns a directed acyclic graph (DAG) in which each node represents a unit of computation and each edge represents a dependence. We then describe an implementation strategy for the language and show that any computation with w work (the number of nodes in the DAG) and d depth (the length of the longest path in the DAG) will run on a p-processor PRAM in O(w/p + d log p) time. The bounds are work efficient (within a constant factor of linear speedup) when there is sufficient parallelism, that is, when w/d ≥ p log p. These are the first time bounds we know of for languages with speculative evaluation. The main challenge is in parallelizing the necessary queuing operations on suspended threads.
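The cost measures in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the DAG encoding (a dictionary from each node to the nodes it depends on), the function names, the choice to count depth in nodes rather than edges, and the use of base-2 logarithms are all assumptions made here for illustration. The constant factor hidden by the O-notation is omitted, so `bound_shape` gives only the shape of the bound, not a running time.

```python
import math
from functools import lru_cache


def work_and_depth(dag):
    """Return (w, d) for a DAG given as {node: [nodes it depends on]}.

    w is the number of nodes. d is the number of nodes on the longest
    dependence path (an assumed convention; counting edges gives d - 1).
    """
    @lru_cache(maxsize=None)
    def longest_path_ending_at(node):
        # A node with no dependences forms a path of length 1.
        return 1 + max((longest_path_ending_at(p) for p in dag[node]), default=0)

    work = len(dag)
    depth = max((longest_path_ending_at(n) for n in dag), default=0)
    return work, depth


def bound_shape(w, d, p):
    """Evaluate w/p + d*log2(p), the expression inside the O(...) bound."""
    return w / p + d * math.log2(p)


def has_sufficient_parallelism(w, d, p):
    """Check the abstract's condition w/d >= p log p (base 2 assumed)."""
    return w / d >= p * math.log2(p)


# Hypothetical diamond-shaped computation: b and c both depend on a,
# and d depends on both b and c.
diamond = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
w, d = work_and_depth(diamond)
```

For the diamond, `work_and_depth` yields 4 nodes of work and a longest path of 3 nodes. With w = 1024 and d = 8 the parallelism w/d is 128, so the condition holds for p = 16 (16 log 16 = 64) but not for p = 32 (32 log 32 = 160).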
