Defining the undefinedness of C

We present a "negative" semantics of the C11 language: a semantics that not only gives meaning to correct programs but also rejects undefined programs. We investigate undefined behavior in C and discuss the techniques and special considerations needed to specify it formally. We have used these techniques to modify and extend a semantics of C into one that captures undefined behavior. The semantic infrastructure and effort this required were unexpectedly large, in the end nearly doubling the size of the original semantics. From our semantics, we have automatically extracted an undefinedness checker, which we evaluate against other popular analysis tools using both our own test suite and a third-party test suite. Our checker detects examples of all 77 categories of core-language undefinedness appearing in the C11 standard, more than any other tool we considered. Based on this evaluation, we argue that our work is the most comprehensive and complete semantic treatment of undefined behavior in C, and thus of the C language itself.
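To make the notion of an "undefined program" concrete, the following is a minimal sketch of two well-known kinds of undefined behavior in C11: signed integer overflow, and division whose result is not representable (including division by zero). It is not the checker described above and does not reflect how that checker is built. The helper names `checked_add` and `checked_div` are hypothetical, introduced only for illustration. Each helper tests the operands first, so the undefined operation is never actually executed.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the paper): computes a + b only when the
 * mathematical sum fits in an int. Evaluating a + b when it does not fit
 * would be signed overflow, which is undefined behavior in C11. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false; /* the addition would overflow */
    }
    *out = a + b;
    return true;
}

/* Hypothetical helper (not from the paper): computes a / b only when the
 * quotient is defined. Division by zero is undefined, and so is
 * INT_MIN / -1 on two's-complement targets, because the quotient is not
 * representable as an int. */
bool checked_div(int a, int b, int *out) {
    if (b == 0 || (a == INT_MIN && b == -1)) {
        return false; /* the division would be undefined */
    }
    *out = a / b;
    return true;
}
```

A semantics that only gives meaning to correct programs has nothing to say about a bare `INT_MAX + 1`; a negative semantics, in the sense used above, is one that identifies such an expression as undefined and rejects it.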
