Translation validation for an optimizing compiler

We describe a translation validation infrastructure for the GNU C compiler. During compilation, the infrastructure compares the intermediate form of the program before and after each compiler pass and verifies that semantics are preserved. We discuss a general framework that the optimizer can use to tell the validator which transformations were performed. Our implementation, however, does not rely on help from the optimizer; instead, it is quite successful using a few heuristics to detect the transformations that take place. The main message of this paper is that a practical translation validation infrastructure, able to check the correctness of many of the transformations performed by a realistic compiler, can be implemented with about the effort typically required to implement one compiler pass. We demonstrate this in the context of the GNU C compiler for a number of its optimizations while compiling realistic programs such as the compiler itself or the Linux kernel. We believe that the price of such an infrastructure is small considering the qualitative increase in the ability to isolate compilation errors during compiler testing and maintenance.
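To make the idea of comparing a program before and after a pass concrete, here is a minimal Python sketch of a validator for straight-line code. It is only an illustration under strong assumptions: the tuple-based instruction format, the function names (`symbolic_eval`, `simplify`, `validate`), and the choice of symbolic evaluation with a small normalizer are all hypothetical and are not taken from the paper or from GCC's intermediate representation. The sketch handles no branches, loops, or memory.

```python
# Toy IR (hypothetical): each instruction is (dest, op, arg1, arg2),
# where args are variable names or integer constants.
# Supported ops: "add", "mul", and "mov" (arg2 is ignored for "mov").

def lookup(env, x):
    """Return the symbolic value of an operand."""
    if isinstance(x, int):
        return ("const", x)
    return env[x]

def simplify(op, a, b):
    """Normalize an expression so equivalent forms compare equal."""
    # Fold operations whose operands are both constants.
    if a[0] == "const" and b[0] == "const":
        if op == "add":
            return ("const", a[1] + b[1])
        if op == "mul":
            return ("const", a[1] * b[1])
    # add and mul are commutative: put operands in a canonical order.
    if op in ("add", "mul") and repr(b) < repr(a):
        a, b = b, a
    return (op, a, b)

def symbolic_eval(block, inputs):
    """Map every variable to an expression over the block's inputs."""
    env = {v: ("var", v) for v in inputs}
    for dest, op, a, b in block:
        if op == "mov":
            env[dest] = lookup(env, a)
        else:
            env[dest] = simplify(op, lookup(env, a), lookup(env, b))
    return env

def validate(before, after, inputs, outputs):
    """Accept the pass only if all live-out values agree symbolically."""
    eb = symbolic_eval(before, inputs)
    ea = symbolic_eval(after, inputs)
    return all(eb[v] == ea[v] for v in outputs)

# Common subexpression elimination: t2 is replaced by t1.
before = [("t1", "add", "a", "b"),
          ("t2", "add", "a", "b"),
          ("r", "mul", "t1", "t2")]
after_ok = [("t1", "add", "a", "b"),
            ("r", "mul", "t1", "t1")]
# A miscompilation: the multiplication became an addition.
after_bad = [("t1", "add", "a", "b"),
             ("r", "add", "t1", "t1")]

cse_accepted = validate(before, after_ok, ["a", "b"], ["r"])
bug_accepted = validate(before, after_bad, ["a", "b"], ["r"])
```

In this sketch the validator is told nothing about which transformation ran; it only sees the two versions of the block. A rejected pair does not prove a compiler bug, because the normalizer is incomplete and may fail to recognize two equivalent expressions. It only flags a pass whose result could not be shown equivalent.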
