Finding and understanding bugs in C compilers

Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong-code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers.
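The abstract does not say how outputs are compared, so the following is only a minimal sketch of one plausible way to flag a wrong-code bug automatically. The assumption is that the same generated program is built by several compilers or optimization levels, and each resulting binary prints a checksum. When the program is free of undefined and unspecified behavior, every correct compiler must produce the same output, so a binary that disagrees with the majority is a suspect. The function name `flag_disagreements` and the labels `cc_a` and `cc_b` are hypothetical and are not taken from Csmith.

```python
from collections import Counter


def flag_disagreements(checksums):
    """Return the labels whose output differs from the strict-majority output.

    `checksums` maps a (hypothetical) compiler/flag label to the text printed
    by the binary that compiler produced. Returns [] when all outputs agree,
    and None when no output has a strict majority to compare against.
    """
    if not checksums:
        return []
    counts = Counter(checksums.values())
    majority_output, votes = counts.most_common(1)[0]
    if votes * 2 <= len(checksums):
        return None  # no strict majority: needs manual inspection
    return sorted(
        label for label, out in checksums.items() if out != majority_output
    )


# Illustrative data only; these outputs are made up for the sketch.
example = {
    "cc_a -O0": "checksum = 1A2B3C4D",
    "cc_a -O2": "checksum = 1A2B3C4D",
    "cc_b -O2": "checksum = 0000FFFF",
}
print(flag_disagreements(example))  # ['cc_b -O2']
```

If the test program contained undefined behavior, a disagreement like the one above would prove nothing, because each compiler would be free to produce a different result. That is why the generator has to avoid such behaviors before this kind of comparison can be trusted.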
