The Dude Runtime System: An Object-Oriented Macro-Dataflow Approach to Integrated Task and Object Parallelism
University of Colorado at Boulder

Modern parallel programming languages allow programmers to specify parallelism using implicitly parallel constructs, such as data-parallel or object-parallel methods, and explicitly parallel constructs, such as doall, doacross, parallel sections, or programmer-level threads. In this paper, we present the design of a runtime system that executes data-parallel (or object-parallel) code in the presence of explicit parallelism. This facilitates load balancing between data-parallel computations running in threads of distinct parallel sections, as well as inter-loop load balancing. Although sufficient runtime structure is provided for most extant languages, the runtime system is extensible, allowing compilers to customize it. To motivate why such a runtime system is desirable, we show performance improvements for programs with complex data dependence relations, such as multigrid solvers.
