A nanopass infrastructure for compiler education

Compilers structured as a small number of monolithic passes are difficult to understand and difficult to maintain. Adding new optimizations often requires major restructuring of existing passes that cannot be understood in isolation. The steep learning curve is daunting, and even experienced developers find it hard to modify existing passes without introducing subtle and tenacious bugs. These problems are especially frustrating when the developer is a student in a compiler class.

An attractive alternative is to structure a compiler as a collection of many small passes, each of which performs a single task. This "micropass" structure aligns the actual implementation of a compiler with its logical organization, simplifying development, testing, and debugging. Unfortunately, writing many small passes duplicates code for traversing and rewriting abstract syntax trees and can obscure the meaningful transformations performed by individual passes.

To address these problems, we have developed a methodology and associated tools that simplify the task of building compilers composed of many fine-grained passes. We describe these compilers as "nanopass" compilers to indicate both the intended granularity of the passes and the amount of source code required to implement each pass. This paper describes the methodology and tools comprising the nanopass framework.
