Restructuring programs for high-speed computers with Polaris

The ability to automatically parallelize standard programming languages makes programs portable across a wide range of machine architectures. The goal of the Polaris project is to develop a new parallelizing compiler that overcomes the limitations of current compilers: while current parallelizing compilers may succeed on small kernels, they often fail to extract any meaningful parallelism from whole applications. A study of application codes concluded that adding a few new techniques to current compilers makes automatic parallelization feasible for a range of whole applications. The techniques needed are interprocedural analysis, scalar and array privatization, symbolic dependence analysis, and advanced induction and reduction recognition and elimination, together with run-time techniques that permit the parallelization of loops whose dependence relations are unknown at compile time.
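As a concrete illustration of three of the techniques the abstract names, the sketch below (invented for this note, not taken from the paper) shows a loop that a naive dependence test would reject, followed by the parallel form a compiler can derive through scalar privatization, generalized induction variable substitution, and reduction recognition. The OpenMP directives are used purely as notation for the derived parallelism; OpenMP is not part of the Polaris work.

/* Sequential form. Three patterns here defeat a naive dependence test,
   yet the loop is parallelizable:
     - t   : a privatizable scalar (written before read in every iteration),
     - k   : a generalized induction variable (its increment varies with i),
     - sum : a reduction (accumulation order is irrelevant up to rounding).
   The caller must size b for at least n*(n+1)/2 elements. */
double kernel_seq(const double *a, double *b, long n) {
    double sum = 0.0;
    long k = 0;
    for (long i = 0; i < n; i++) {
        double t = a[i] * a[i];
        b[k] = t;
        k += i + 1;              /* k grows quadratically: k = i*(i+1)/2 at entry */
        sum += t;
    }
    return sum;
}

/* The transformed loop an automatic parallelizer can derive: t is made
   private, sum is treated as a reduction, and k is replaced by its closed
   form so that all iterations become independent. */
double kernel_par(const double *a, double *b, long n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        double t = a[i] * a[i];  /* private by block scoping */
        b[i * (i + 1) / 2] = t;  /* closed form of the induction variable k */
        sum += t;
    }
    return sum;
}

Each iteration of the transformed loop now reads and writes only data indexed by its own value of i, which is exactly the property the privatization and induction-substitution passes establish before a loop can be marked parallel.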
