Automatic generation of data-flow analyzers: a tool for building optimizers

Modern compilers generate good code by performing global optimizations. Unlike other functions of the compiler, such as parsing and code generation, which examine only one statement or one basic block at a time, optimizers examine large parts of a program and coordinate changes in widely separated parts of it. Optimizers therefore use more complex data structures and consume more time. To generate the best code, optimizers perform not one global transformation but many in concert, and these transformations can interact in unforeseen ways. This dissertation concerns the building of optimizers that are modular and extensible. It espouses an optimizer architecture, first proposed by Kildall, in which each phase is based on a data-flow analysis (DFA) of the program and on an optimization function that transforms the program. To support the architecture, a set of abstractions (flow values, flow functions, path simplification rules, and action routines) is provided. A tool called Sharlit turns a DFA specification consisting of these abstractions into a solver for a DFA problem. At the heart of Sharlit is an algorithm called path simplification, an extension of Tarjan's fast path algorithm. Path simplification unifies several powerful DFA solution techniques. By using path simplification rules, compiler writers can construct a wide range of data-flow analyzers, from simple iterative ones to solvers that use local analysis, interval analysis, or sparse data-flow evaluation. Sharlit frees compiler writers from the details of how these various solution techniques work. The compiler writer can view the program representation as a simple flow graph in which each instruction is a node. Data structures to represent basic blocks and other regions are generated automatically. Sharlit promotes modularity by making it possible to build more complex data-flow analyzers from simpler ones. It promotes extensibility by making it easier to add, to existing analyzers, new flow functions and path simplification rules that represent new kinds of instructions. A complete optimizer has been built using Sharlit. Measurements showed that this optimizer can compete in code quality with commercial optimizing compilers.
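
To make the terms "flow value" and "flow function" concrete, the following is a minimal sketch of the simplest kind of analyzer the abstract mentions: an iterative solver over a flow graph in which each instruction is a node. It is not Sharlit's specification language and does not implement path simplification. The problem chosen (reaching definitions), the function names `solve_forward` and `reaching_defs_flow`, and the five-instruction example program are all assumptions made for illustration.

```python
from collections import deque

def solve_forward(succs, flow_fn, meet, initial):
    """Iterate a forward data-flow problem to a fixed point.

    succs   : dict mapping each node to its list of successor nodes
    flow_fn : (node, in_value) -> out_value, the node's flow function
    meet    : combines the flow values arriving from two predecessors
    initial : starting flow value for every node (identity of `meet`)
    """
    nodes = list(succs)
    preds = {n: [] for n in nodes}
    for n in nodes:
        for s in succs[n]:
            preds[s].append(n)

    in_val = {n: initial for n in nodes}
    out_val = {n: initial for n in nodes}
    work = deque(nodes)
    while work:
        n = work.popleft()
        acc = initial
        for p in preds[n]:
            acc = meet(acc, out_val[p])
        in_val[n] = acc
        new_out = flow_fn(n, acc)
        if new_out != out_val[n]:
            # The value leaving n changed, so its successors must be revisited.
            out_val[n] = new_out
            for s in succs[n]:
                if s not in work:
                    work.append(s)
    return in_val, out_val

# Hypothetical program, one instruction per node:
#   1: x = 1
#   2: y = x        (loop head)
#   3: x = 2
#   4: branch to 2 or fall through to 5
#   5: use x
succs = {1: [2], 2: [3], 3: [4], 4: [2, 5], 5: []}
defines = {1: "x", 2: "y", 3: "x", 4: None, 5: None}

def reaching_defs_flow(node, value):
    """Flow function: kill earlier definitions of the same variable, add this one."""
    var = defines[node]
    if var is None:
        return value
    killed = frozenset(d for d in value if defines[d] == var)
    return (value - killed) | {node}

reach_in, reach_out = solve_forward(
    succs, reaching_defs_flow, meet=frozenset.union, initial=frozenset()
)
print(sorted(reach_in[5]))  # definitions that may reach instruction 5
```

Here the flow values are sets of defining instructions, the meet is set union, and each instruction contributes one flow function. In this example only the definition of x at instruction 3 and the definition of y at instruction 2 reach instruction 5, because instruction 3 kills the definition at instruction 1 on every path.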
