Nanopass compiler infrastructure

A compiler that is structured as a small number of monolithic passes is difficult to understand and difficult to maintain. Finding bugs in such a compiler is often difficult, and adding new optimizations and analyses sometimes requires major restructuring that can result in subtle and tenacious bugs. A natural solution to this problem is to structure the compiler as a series of correctness-preserving transformations, each of which performs a small part of the compilation process. Unfortunately, as each pass of the compiler becomes simpler, the number of passes required to accomplish the entire task becomes greater. Even though a pass may make significant changes to only a few of the intermediate-language forms, it must still handle the remaining forms. This coding overhead can more than offset the benefits of having a fine-grained structure, as the sheer volume of essentially repetitive code can obscure the meaningful transformations performed by the compiler. This dissertation describes a new “nanopass” infrastructure that eliminates most of the repetitive coding overhead, making the development of such compilers simpler, less tedious, and less error-prone. The nanopass infrastructure formalizes the declaration of intermediate languages, and a compiler written using the infrastructure rejects intermediate-language programs that are not well-formed according to these declarations, increasing reliability. The infrastructure also represents intermediate-language code internally using efficient low-level data structures while allowing the compiler writer to interact with the code at a higher, more readable level. The infrastructure provides additional tools to support rapid development of new compilers, as well as experimentation with new language features and code improvement strategies in existing compilers. The modular nature of the compilers developed with the infrastructure should also support rapid adaptation of general-purpose compilers to domain-specific purposes.
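The overhead described above, and the kind of relief an infrastructure can offer, can be illustrated with a minimal sketch. It is not the dissertation's infrastructure or its API: the tiny intermediate language (`Const`, `Var`, `Add`, `Neg`, `Sub`), the generic driver `run_pass`, the pass `remove_sub`, and the checker `check_no_sub` are all hypothetical names chosen for illustration, written in Python only for brevity. The pass states a rule for the single form it changes, the driver rebuilds every other form automatically, and the checker rejects output that is not well-formed in the target language.

```python
from dataclasses import dataclass, fields, replace
from typing import Callable, Optional


class Expr:
    """Base class for the forms of a tiny, hypothetical intermediate language."""


@dataclass(frozen=True)
class Const(Expr):
    value: int


@dataclass(frozen=True)
class Var(Expr):
    name: str


@dataclass(frozen=True)
class Add(Expr):
    left: Expr
    right: Expr


@dataclass(frozen=True)
class Neg(Expr):
    operand: Expr


@dataclass(frozen=True)
class Sub(Expr):
    left: Expr
    right: Expr


Rule = Callable[[Expr], Optional[Expr]]


def run_pass(rule: Rule, expr: Expr) -> Expr:
    """Apply `rule` bottom-up; forms the rule ignores are rebuilt unchanged."""
    # Recur into every field that holds a sub-expression.
    changes = {
        f.name: run_pass(rule, getattr(expr, f.name))
        for f in fields(expr)
        if isinstance(getattr(expr, f.name), Expr)
    }
    rebuilt = replace(expr, **changes) if changes else expr
    result = rule(rebuilt)
    return rebuilt if result is None else result


def remove_sub(expr: Expr) -> Optional[Expr]:
    """The only meaningful clause: rewrite Sub(a, b) as Add(a, Neg(b))."""
    if isinstance(expr, Sub):
        return Add(expr.left, Neg(expr.right))
    return None  # every other form is left to the driver


def check_no_sub(expr: Expr) -> None:
    """Reject programs that are not well-formed in the Sub-free target language."""
    if isinstance(expr, Sub):
        raise ValueError(f"ill-formed output: {expr!r}")
    for f in fields(expr):
        child = getattr(expr, f.name)
        if isinstance(child, Expr):
            check_no_sub(child)


program = Sub(Add(Var("x"), Const(1)), Sub(Const(2), Var("y")))
lowered = run_pass(remove_sub, program)
check_no_sub(lowered)
```

Without the generic driver, `remove_sub` would need an explicit clause for each of `Const`, `Var`, `Add`, and `Neg` that merely rebuilds the form, which is the repetitive code the abstract refers to. The sketch covers only that point; it does not model the efficient low-level internal representation that the abstract also mentions.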
