Complete removal of redundant expressions

Partial redundancy elimination (PRE), the most important component of global optimizers, generalizes the removal of common subexpressions and loop-invariant computations. Because existing PRE implementations are based on code motion, they fail to completely remove the redundancies. In fact, we observed that 73% of loop-invariant statements cannot be eliminated from loops by code motion alone. In dynamic terms, traditional PRE eliminates only half of redundancies that are strictly partial. To achieve a complete PRE, control flow restructuring must be applied. However, the resulting code duplication may cause code size explosion.

This paper focuses on achieving a complete PRE while incurring an acceptable code growth. First, we present an algorithm for complete removal of partial redundancies, based on the integration of code motion and control flow restructuring. In contrast to existing complete techniques, we resort to restructuring merely to remove obstacles to code motion, rather than to carry out the actual optimization.

Guiding the optimization with a profile enables additional code growth reduction through selecting those duplications whose cost is justified by sufficient execution-time gains. The paper develops two methods for determining the optimization benefit of restructuring a program region, one based on path-profiles and the other on data-flow frequency analysis. Furthermore, the abstraction underlying the new PRE algorithm enables a simple formulation of speculative code motion guaranteed to have positive dynamic improvements. Finally, we show how to balance the three transformations (code motion, restructuring, and speculation) to achieve a near-complete PRE with very little code growth.

We also present algorithms for efficiently computing dynamic benefits. In particular, using an elimination-style data-flow framework, we derive a demand-driven frequency analyzer whose cost can be controlled by permitting a bounded degree of conservative imprecision in the solution.
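To make the difference between code motion and control flow restructuring concrete, the following is a minimal illustrative sketch, not the paper's algorithm. It assumes a toy program in which `a + b` is computed under a condition `p` and again under a later condition `q`, so the second computation is redundant only on paths where `p` held. Hoisting `a + b` above the first branch would remove that redundancy but would add an evaluation on the path where neither condition holds, which is speculation rather than safe code motion. Duplicating the second test into each arm of the first removes the redundancy without adding work on any path. The function names `original`, `restructured`, and `compare` are hypothetical and chosen only for this example.

```python
from itertools import product


def original(p, q, a, b):
    """Toy program; returns (x, y, number of evaluations of a + b)."""
    evals = 0
    x = y = None
    if p:
        x = a + b
        evals += 1
    # Control flow merges here, so the availability of a + b is path-dependent.
    if q:
        y = a + b  # redundant only on paths where p was true
        evals += 1
    return x, y, evals


def restructured(p, q, a, b):
    """Same program after duplicating the second test into both arms."""
    evals = 0
    x = y = None
    if p:
        t = a + b
        evals += 1
        x = t
        if q:  # duplicated copy of the second test
            y = t  # a + b is fully redundant here, so the saved value is reused
    else:
        if q:  # the other copy still has to compute the value
            y = a + b
            evals += 1
    return x, y, evals


def compare(a=3, b=4):
    """List (p, q, evaluations before, evaluations after) for every path."""
    rows = []
    for p, q in product([False, True], repeat=2):
        rows.append((p, q, original(p, q, a, b)[2], restructured(p, q, a, b)[2]))
    return rows
```

In this sketch the restructured version computes the same `x` and `y` on every path and never evaluates `a + b` more often than the original, at the price of one duplicated test. That duplication is the code growth the abstract refers to, and it is the cost a profile-guided variant would weigh against how often the path with both conditions true is actually executed.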
