Exploiting virtual machine infrastructure to implement low-overhead error checking tools

Program-specific bugs are a growing problem in modern software. General bugs (language-level bugs that would be errors in any program, such as memory leaks and buffer overflows) have mostly been solved by modern programming languages and tools. However, program-specific bugs, such as violations of data structure invariants, remain. In addition, modern trends such as the construction of large programs, the use of large standard libraries and third-party frameworks, and increasingly high-level languages conspire to make program-specific bugs even more common in the future. Static analysis tools struggle with the size of these programs and with language features such as dynamic class loading. Current dynamic analysis tools are too slow, often incurring slowdowns of one to two orders of magnitude, and thus can only be used in a debugging environment. In this thesis, we introduce a set of dynamic analysis tools that help programmers find program-specific bugs in their software. These tools differ from previous work in that they leverage virtual machine infrastructure (subsystems such as the garbage collector, just-in-time compiler, and memory manager) to check programmer-specified properties at much lower cost. We show that the performance overheads of our tools are very low, typically under 5%, and that the tools are useful for finding bugs in real programs. They hit a sweet spot between performance overhead and code coverage, and we believe that they will help address the growing problem of program-specific bugs in production software.
