Detecting Coarse-Grain Parallelism Using an Interprocedural Parallelizing Compiler

This paper presents an extensive empirical evaluation of an interprocedural parallelizing compiler developed as part of the Stanford SUIF compiler system. The system incorporates a comprehensive and integrated collection of analyses, including privatization and reduction recognition for both array and scalar variables, and symbolic analysis of array subscripts. The interprocedural analysis framework is designed to provide results nearly as precise as full inlining, but without its associated costs. Experimentation with this system shows that it can detect coarser-grained parallelism than was previously possible. Specifically, it can parallelize loops that span numerous procedures and hundreds of lines of code, and that frequently require modifications to array data structures, such as privatization and reduction transformations. Measurements from several standard benchmark suites demonstrate that an integrated combination of interprocedural analyses can substantially advance the capability of automatic parallelization technology.
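To make the two transformations named in the abstract concrete, the following is a minimal hand-written sketch, not output of the SUIF compiler. All names (`fill`, `sequential`, `chunk`, `parallel`, `N`, `M`) are hypothetical. The loop body calls a procedure that overwrites a scratch array, so proving the loop parallel requires looking across the call (the interprocedural part); the scratch array is then privatized, and the running sum is treated as a reduction.

```python
from concurrent.futures import ThreadPoolExecutor

N, M = 8, 4  # illustrative sizes only


def fill(tmp, i):
    # Callee writes every element of tmp before the caller reads any of them,
    # so no value of tmp flows from one loop iteration to the next.
    for j in range(M):
        tmp[j] = i * j


def sequential(n):
    tmp = [0] * M  # one scratch array reused by every iteration
    total = 0      # accumulator updated only by "total += ..."
    for i in range(n):
        fill(tmp, i)
        total += sum(tmp)
    return total


def chunk(lo, hi):
    tmp = [0] * M  # privatization: each worker gets its own copy of tmp
    partial = 0    # reduction: each worker accumulates a private partial sum
    for i in range(lo, hi):
        fill(tmp, i)
        partial += sum(tmp)
    return partial


def parallel(n, workers=4):
    # Split the iteration space into contiguous blocks, one per task.
    step = max(1, (n + workers - 1) // workers)
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda b: chunk(*b), bounds)
        # Combine the partial sums after all blocks finish.
        return sum(partials)


if __name__ == "__main__":
    print(sequential(N), parallel(N))
```

The sketch only shows the shape of the transformed loop; Python threads are used for brevity and are not expected to yield a speedup here. Integer arithmetic is chosen so that reordering the additions cannot change the result, which would not hold in general for floating-point reductions.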
