Parallelization of FORTRAN code on distributed-memory parallel processors

This paper presents preliminary results on the automatic parallelization of uniprocessor FORTRAN code for distributed-memory parallel processors (DMPPs). It introduces Oxygen, a compiler for a DMPP under development at the Laboratory. We discuss the design of Oxygen and its parallelization strategy, and analyze its most significant components, together with performance benchmarks. Oxygen carries out data consistency analysis at run time; our results show that the overhead this introduces is acceptable. Run-time data consistency analysis may also be the only viable approach to parallelizing certain “hard” algorithms, as we show in this study.
