Enforcing Alias Analysis for Weakly Typed Languages

Static analysis of programs in weakly typed languages such as C and C++ is generally not sound because of possible memory errors: dangling pointer references, uninitialized pointers, and array bounds overflows. Optimizing compilers can produce unpredictable results when such errors occur, which is undesirable for tools that aim to analyze security and reliability properties with guarantees of soundness. We describe a compilation strategy for standard C programs that guarantees sound semantics for an aggressive interprocedural pointer analysis (or simpler ones), a call graph, and type information for a subset of memory. These provide the foundation for applying sophisticated static analyses to such programs with a guarantee of soundness. Our work builds on a previously published transformation called Automatic Pool Allocation to ensure that hard-to-detect memory errors (dangling pointer references and certain array bounds errors) cannot invalidate the call graph, points-to information, or type information. The key insight behind our approach is that pool allocation can be used to create a run-time partitioning of memory that matches the compile-time memory partitioning in a points-to graph, and that efficient checks can be used to isolate the run-time partitions. Furthermore, we show that the sound analysis information enables static checking techniques that reliably eliminate many run-time checks. We formalize our approach as a new type system with the necessary run-time checks in the operational semantics, and we prove the correctness of our approach for a subset of C. Our approach requires no source code changes, allows memory to be managed explicitly, and does not use metadata on pointers or individual tag bits for memory. Using several benchmarks and system codes, we show experimentally that the run-time overheads are low (less than 10% in nearly all cases and 30% in the worst case we have seen). We also show the effectiveness of reliable static analyses in eliminating run-time checks.
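
The idea of matching run-time partitions to points-to graph nodes and isolating them with checks can be illustrated with a minimal C sketch. This is not the paper's implementation: Automatic Pool Allocation is a compiler transformation, and the names Pool, pool_init, pool_alloc, pool_contains, and pool_destroy, along with the bump-pointer layout, are assumptions made only for illustration. The sketch assumes one contiguous region per points-to node and a check that a pointer stays inside the region the analysis assigned to it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pool: one contiguous run-time region that stands in for
 * one node of the compile-time points-to graph. */
typedef struct {
    unsigned char *base; /* start of the region */
    size_t size;         /* total bytes in the region */
    size_t used;         /* bytes handed out so far (bump pointer) */
} Pool;

/* Reserve the backing memory for a pool; returns false on failure. */
bool pool_init(Pool *p, size_t size) {
    p->base = malloc(size);
    p->size = size;
    p->used = 0;
    return p->base != NULL;
}

/* Allocate n bytes from the pool, or return NULL if it is full. */
void *pool_alloc(Pool *p, size_t n) {
    size_t align = _Alignof(max_align_t);
    size_t start = (p->used + align - 1) / align * align;
    if (start > p->size || n > p->size - start)
        return NULL;
    p->used = start + n;
    return p->base + start;
}

/* Isolation check (illustrative): does the n-byte access at ptr fall
 * entirely inside the allocated part of this pool? */
bool pool_contains(const Pool *p, const void *ptr, size_t n) {
    uintptr_t addr = (uintptr_t)ptr;
    uintptr_t lo = (uintptr_t)p->base;
    if (addr < lo)
        return false;
    size_t off = (size_t)(addr - lo);
    return off <= p->used && n <= p->used - off;
}

/* Release the whole pool at once. */
void pool_destroy(Pool *p) {
    free(p->base);
    p->base = NULL;
    p->size = 0;
    p->used = 0;
}
```

In this sketch, a pointer that the analysis places in one pool can be checked against that pool before it is dereferenced, so an erroneous access cannot silently reach memory belonging to a different partition. The static checking described above would aim to prove many such checks unnecessary at compile time.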
