Accurate analysis of array references

Modern computer systems rely increasingly on parallelism to improve performance. Automatic parallelization techniques offer the hope that users can exploit this parallelism simply and portably. This thesis addresses data dependence analysis, the foundational step in detecting loop-level parallelism in scientific programs. Exploiting parallelism can change the order of memory operations, so data dependence analysis examines the dynamic memory reference behavior of array operations to ensure that compilers parallelize a loop only when any resulting reordering of memory references preserves the program's sequential semantics. In general, data dependence analysis is undecidable, and compilers must conservatively approximate array reference behavior, thereby serializing loops that could have run in parallel. Traditional data dependence research has concentrated on the simpler problem of affine memory disambiguation, and even for that problem many of the algorithms developed are merely conservative approximations. By cascading a series of algorithms, each guaranteed to be exact for a particular class of inputs, we devise a new method that in practice solves the affine memory disambiguation problem exactly and efficiently. Because our affine memory disambiguator is exact in practice, we can design an experiment to test how well affine memory disambiguation approximates the full data dependence problem. We find that the lack of data-flow information on array elements is the key limitation of affine memory disambiguators, and we develop a new representation and algorithm to compute these data-flow dependences efficiently. Finally, we address interprocedural data dependence analysis: by using an array summary representation that is guaranteed to be exact when applicable, we combine summary information with inlining to analyze affine array references across procedure boundaries exactly and efficiently. Taken together, our algorithms generate the more accurate dependence information that future parallel systems will need.
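To make the affine setting concrete, here is a minimal sketch of the classical GCD test, one of the conservative single-subscript dependence tests that a cascade of exact algorithms is meant to improve upon. The function name and interface are illustrative assumptions, not the thesis's actual method, and the test ignores loop bounds, so it is conservative rather than exact.

```python
from math import gcd

def gcd_test(a: int, b: int, c: int, d: int) -> bool:
    """Conservative dependence test for the reference pair
    A[a*i + b] and A[c*i + d] inside a common loop over i.

    The references can touch the same element only if the linear
    Diophantine equation a*i1 - c*i2 = d - b has an integer
    solution, which requires gcd(a, c) to divide d - b.  Loop
    bounds are ignored, so True means a dependence MAY exist.
    """
    g = gcd(a, c)
    if g == 0:            # both coefficients zero: constant subscripts
        return b == d     # dependence iff the constants coincide
    return (d - b) % g == 0

# A[2*i] vs. A[2*i + 1]: gcd 2 does not divide 1, so the references
# are provably independent and the loop may be parallelized.
assert gcd_test(2, 0, 2, 1) is False
# A[2*i] vs. A[4*i + 2]: gcd 2 divides 2, so a dependence may exist.
assert gcd_test(2, 0, 4, 2) is True
```

An exact disambiguator must additionally intersect the integer solution set with the loop bounds, which is where integer programming techniques and cascades of special-case exact tests come in.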
