Execution of automatically parallelized APL programs on RP3

We have implemented an experimental APL/C compiler, which accepts ordinary APL programs and produces C programs. We have also implemented a run-time environment that supports the parallel execution of these C programs on the RP3 computer, a shared-memory, 64-way MIMD machine built at the IBM Thomas J. Watson Research Center. The APL/C compiler uses the front end of the APL/370 compiler and imposes the same restrictions, but requires no parallelization directives from the user. The run-time environment is based on simple synchronization primitives and is implemented using Mach threads. We report the speedups of several compiled programs running on RP3 under the Mach operating system. The current implementation exploits only data parallelism. We discuss the relationship between the style of an APL program and its expected benefit from the automatic parallel execution provided by our compiler.

Introduction

During the past decade, there has been a tremendous amount of research on parallel processing intended to speed up the execution of applications. Confronted with a new parallel machine, a programmer usually has to prepare the application program for parallel execution by hand, using either a new parallel-programming language or a conventional language with added parallel constructs. The reported successful uses of the Connection Machine [1] and of the GF-11 [2] were achieved in this way. It is interesting to note that both of these parallel machines are of the SIMD (single-instruction-stream, multiple-data-stream) type.
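To make the abstract's notion of data parallelism concrete, the following is a minimal sketch of how an element-wise APL expression such as `R←A+B` might be divided among threads in C. It is not the code generated by the APL/C compiler or its run-time environment: POSIX threads stand in for Mach threads, a plain join stands in for the run-time's synchronization primitives, and the names `slice_t`, `add_slice`, `parallel_vector_add`, and `MAX_WORKERS` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Upper bound on worker threads; 64 mirrors the RP3 processor count,
   but the value is an assumption made for this sketch only. */
#define MAX_WORKERS 64

/* One contiguous slice of the element-wise operation r[i] = a[i] + b[i]. */
typedef struct {
    const double *a;
    const double *b;
    double *r;
    size_t begin;   /* first index of the slice (inclusive) */
    size_t end;     /* last index of the slice (exclusive)  */
} slice_t;

/* Thread body: each worker writes only its own disjoint part of r,
   so no locking is needed inside the loop. */
static void *add_slice(void *arg)
{
    slice_t *s = (slice_t *)arg;
    for (size_t i = s->begin; i < s->end; i++)
        s->r[i] = s->a[i] + s->b[i];
    return NULL;
}

/* Hypothetical helper: computes r = a + b over n elements using at most
   nworkers threads. Returns the number of threads actually started. */
int parallel_vector_add(const double *a, const double *b, double *r,
                        size_t n, int nworkers)
{
    pthread_t tid[MAX_WORKERS];
    slice_t slice[MAX_WORKERS];
    int started = 0;

    assert(n == 0 || (a != NULL && b != NULL && r != NULL));
    if (nworkers < 1) nworkers = 1;
    if (nworkers > MAX_WORKERS) nworkers = MAX_WORKERS;

    /* Ceiling division so that the slices cover all n elements. */
    size_t chunk = (n + (size_t)nworkers - 1) / (size_t)nworkers;

    for (int w = 0; w < nworkers; w++) {
        size_t begin = (size_t)w * chunk;
        if (begin >= n) break;                 /* nothing left to assign */
        size_t end = (begin + chunk < n) ? begin + chunk : n;
        slice[w] = (slice_t){ a, b, r, begin, end };
        if (pthread_create(&tid[w], NULL, add_slice, &slice[w]) != 0) {
            /* Thread creation failed: finish the rest in the caller. */
            slice[w].end = n;
            add_slice(&slice[w]);
            break;
        }
        started++;
    }

    /* Joining every worker acts as the barrier after the primitive. */
    for (int w = 0; w < started; w++)
        pthread_join(tid[w], NULL);
    return started;
}
```

In this sketch the per-primitive cost of creating and joining threads is paid on every call, so short vectors would gain little or nothing. This is one way to see why the expected benefit depends on program style: code that operates on large arrays gives each worker enough work to amortize that overhead.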

[1]  Roy Dz-Ching Ju, et al. Exploitation of APL data parallelism on a shared-memory MIMD machine, 1991, PPOPP '91.

[2]  Robert Bernecky, et al. ACORN: APL to C on real numbers, 1990.

[3]  Ron Cytron, et al. An Overview of the PTRAN Analysis System for Multiprocessing, 1988, J. Parallel Distributed Comput.

[4]  Edith Schonberg, et al. Low-overhead scheduling of nested parallelism, 1991, IBM J. Res. Dev.

[5]  Alexander V. Veidenbaum, et al. The effect of restructuring compilers on program performance for high-speed computers, 1985.

[6]  Boleslaw K. Szymanski, et al. Parallel functional languages and compilers, 1991.

[7]  Vivek Sarkar, et al. Determining average program execution times and their variance, 1989, PLDI '89.

[8]  Steve L. Karman, et al. Benchmark calculations with an unstructured grid flow solver on a SIMD computer, 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).

[9]  Alain Guillon. An APL compiler: The SOFREMI-AGL compiler, a tool to produce low-cost efficient software, 1987, APL '87.

[10]  Raymond M. Bryant, et al. Operating system support for parallel programming on RP3, 1991, IBM J. Res. Dev.

[11]  Wai-Mee Ching, et al. Program Analysis and Code Generation in an APL/370 Compiler, 1986, IBM J. Res. Dev.

[12]  Kevin P. McAuliffe, et al. The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture, 1985, ICPP.

[13]  Michael G. Burke. An interval-based approach to exhaustive and incremental interprocedural data-flow analysis, 1990, TOPLAS.

[14]  G. Amdahl, et al. Validity of the single processor approach to achieving large scale computing capabilities, 1967, AFIPS '67 (Spring).

[15]  Donald L. Orth, et al. Compiling APL: The Yorktown APL Translator, 1986, IBM J. Res. Dev.

[16]  Wai-Mee Ching, et al. Evon: An Extended von Neumann Model for Parallel Processing, 1986, FJCC.

[17]  Timothy A. Budd. An APL compiler for the UNIX timesharing system, 1983.