A nanopass framework for commercial compiler development

Contemporary compilers must typically handle sophisticated high-level source languages, generate efficient code for multiple hardware architectures and operating systems, and support source-level debugging, profiling, and other program development tools. As a result, compilers tend to be among the most complex of software systems. Nanopass frameworks are designed to help manage this complexity. A nanopass compiler is composed of many single-task passes with formally defined intermediate languages. The perceived downside of a nanopass compiler is that the extra passes will lead to substantially longer compilation times. To determine whether this is the case, we have created a drop-in replacement for the commercial Chez Scheme compiler, implemented using an updated nanopass framework, and we have compared the speed of the new compiler and the code it generates against the original compiler for a large set of benchmark programs. This paper describes the updated nanopass framework, the new compiler, and the results of our experiments. The new compiler produces code that is faster than the original's by an average of 15-27%, depending on architecture and optimization level, due to a more sophisticated but slower register allocator and improvements to several optimizations. Compilation times average well within a factor of two of the original compiler, despite the slower register allocator and the replacement of five of the original 10 passes with over 50 nanopasses.
