State of the Union: Type Inference Via Craig Interpolation

The ad-hoc use of unions to encode disjoint sum types in C programs, together with the inability of C's type system to check that these unions are used safely, is a long-standing source of subtle bugs. We present a dependent type system that rigorously captures the ad-hoc protocols that programmers use to encode disjoint sums, and introduce a novel technique for automatically inferring, via Craig interpolation, those dependent types and thus those protocols. In addition to checking the safe use of unions, the dependent type information inferred by interpolation gives programmers who want to modify or extend legacy code a precise understanding of the conditions under which each field may safely be accessed. We present an empirical evaluation of our technique on 350 KLOC of open source C code. In 80 out of 90 predicated edges (corresponding to 1472 out of 1684 union accesses), our type system is able to infer the correct dependent types. This demonstrates that our type system captures and explicates programmers' informal reasoning about unions, without requiring manual annotation or rewriting.
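
As an illustration of the kind of ad-hoc protocol described above, the following is a minimal sketch of a tag field guarding a union in C. The type, field, and function names (`shape`, `tag`, `SHAPE_CIRCLE`, `shape_area`, and so on) are hypothetical and are not taken from the evaluated code; the sketch only shows the pattern that C's type system cannot check on its own.

```c
#include <assert.h>

/* Hypothetical tag values; the names are illustrative only. */
enum shape_tag { SHAPE_CIRCLE = 1, SHAPE_RECT = 2 };

struct shape {
    int tag; /* records which member of u currently holds valid data */
    union {
        struct { double radius; } circle;        /* meaningful only if tag == SHAPE_CIRCLE */
        struct { double width, height; } rect;   /* meaningful only if tag == SHAPE_RECT */
    } u;
};

struct shape make_circle(double radius)
{
    struct shape s;
    s.tag = SHAPE_CIRCLE;
    s.u.circle.radius = radius;
    return s;
}

struct shape make_rect(double width, double height)
{
    struct shape s;
    s.tag = SHAPE_RECT;
    s.u.rect.width = width;
    s.u.rect.height = height;
    return s;
}

/* The guard is made explicit with a run-time assertion; the C compiler
 * itself would accept this access without any check on the tag. */
double circle_radius(const struct shape *s)
{
    assert(s->tag == SHAPE_CIRCLE);
    return s->u.circle.radius;
}

/* Each union access sits on a branch guarded by a test of the tag.
 * Returns 0 and stores the area in *out, or -1 for an unknown tag. */
int shape_area(const struct shape *s, double *out)
{
    switch (s->tag) {
    case SHAPE_CIRCLE:
        *out = 3.141592653589793 * circle_radius(s) * circle_radius(s);
        return 0;
    case SHAPE_RECT:
        *out = s->u.rect.width * s->u.rect.height;
        return 0;
    default:
        return -1;
    }
}
```

In this sketch the informal protocol is that `u.circle` is read only when `tag == SHAPE_CIRCLE` and `u.rect` only when `tag == SHAPE_RECT`. Guard conditions of this sort are what a dependent type for the union would have to state, and nothing in plain C enforces them.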
