Semantics-Based Parallel Cost Models and Their Use in Provably Efficient Implementations

Abstract: Understanding the performance of programs written in modern programming languages can be difficult. These languages have abstract features, such as higher-order functions, laziness, and objects, that ease programming but make the mapping to the underlying machine harder to follow. Understanding parallel languages is further complicated by the need to describe which computations are performed in parallel and how they are affected by communication and latency in the machine. This lack of understanding can obscure even the asymptotic performance of a program and can also hide performance bugs in the language implementation. The dissertation introduces a framework of provably efficient implementations in which the performance issues of a language can be defined and analyzed. We define several language models, each consisting of an operational semantics augmented with the costs of execution. In particular, the dissertation examines three functional languages, based on fork-and-join parallelism, speculative parallelism, and data parallelism, and analyzes their time and space costs. We then define implementations of each language model on several common machine models, prove these implementations correct, and derive their costs. Each implementation is staged through an intermediate model based on an abstract machine. The abstract machine executes a series of steps, each transforming a stack of active states and a store into new states and a new store. The dissertation proves the efficiency of each implementation by relating these steps to the parallel traversal of a computation graph defined in the augmented operational semantics. Provably efficient implementations are useful for programmers, language implementors, and language designers.
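
The idea of an operational semantics augmented with costs can be illustrated with a minimal sketch. It is not the dissertation's formal model: the toy expression language (`Num`, `Add`), the cost rules (one unit of work per node, both operands of `Add` evaluated in parallel), and the helper names are all assumptions made for illustration. The step bound is the classical Brent-style bound for greedy scheduling, used here as a stand-in for the machine-specific bounds that the dissertation derives.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    """A literal; assumed to cost one unit of work and one unit of depth."""
    value: int


@dataclass(frozen=True)
class Add:
    """A fork-join node: both operands are assumed to run in parallel."""
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]


def eval_with_costs(e: Expr) -> tuple[int, int, int]:
    """Evaluate e and return (value, work, depth).

    work  = number of nodes in the computation graph (sequential cost)
    depth = length of the longest dependence chain (parallel cost)
    """
    if isinstance(e, Num):
        return e.value, 1, 1
    lv, lw, ld = eval_with_costs(e.left)
    rv, rw, rd = eval_with_costs(e.right)
    # Work adds up across branches; depth takes the slower branch.
    return lv + rv, lw + rw + 1, max(ld, rd) + 1


def greedy_step_bound(work: int, depth: int, processors: int) -> int:
    """Brent-style upper bound on steps: ceil(work / p) + depth."""
    return -(-work // processors) + depth


def balanced_sum(lo: int, hi: int) -> Expr:
    """Build a balanced Add tree over the integers lo, ..., hi - 1."""
    if hi - lo == 1:
        return Num(lo)
    mid = (lo + hi) // 2
    return Add(balanced_sum(lo, mid), balanced_sum(mid, hi))


value, work, depth = eval_with_costs(balanced_sum(0, 8))
print(value, work, depth)               # 28 15 4
print(greedy_step_bound(work, depth, 4))  # 8
```

In this sketch the pair (work, depth) plays the role of the costs attached to the semantics, and `greedy_step_bound` stands in for the kind of statement a provably efficient implementation makes: a guarantee on running time expressed in terms of those abstract costs and the number of processors.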
