Polynomial Time Array Dataflow Analysis

Array dataflow analysis is a valuable tool for supercomputer compilers. However, the worst-case asymptotic time complexities for modern array dataflow analysis techniques are either not well understood or alarmingly high. For example, the Omega Test uses a subset of the 2^(2^(2^O(n))) language of Presburger Arithmetic for analysis of affine dependences; its use of uninterpreted function symbols for nonaffine terms introduces additional sources of complexity. Even traditional data dependence analysis of affine dependences is equivalent to integer programming, and is thus NP-complete. These worst-case complexities have raised questions about the wisdom of using array dataflow analysis in a production compiler, despite empirical data that show that various tests run quickly in practice. In this paper, we demonstrate that a polynomial-time algorithm can produce accurate information about the presence of loop-carried array dataflow. We first identify a subdomain of Presburger Arithmetic that can be manipulated (by the Omega Library) in polynomial time; we then describe a modification to prevent exponential blowup of the Omega Library's algorithm for manipulating function symbols. Restricting the Omega Test to these polynomial cases can, in principle, reduce the accuracy of the dataflow information produced. We therefore present the results of our investigation of the effects of these restrictions on the detection of loop-carried array dataflow dependences (which prevent parallelization). These restrictions block parallelization of only a few unimportant loop nests in the approximately 18000 lines of benchmark code we studied. The use of our subdomain of Presburger Arithmetic also gives a modest reduction in analysis time, even with our current unoptimized implementation, as long as we do not employ our modified algorithms for function symbols. The data collected in our empirical studies also suggest directions for improving both accuracy and efficiency.
