LeakChaser: helping programmers narrow down causes of memory leaks

In large programs written in managed languages such as Java and C#, holding unnecessary references often results in memory leaks and bloat, significantly degrading run-time performance and scalability. Many leak detectors exist for such languages, but they often target low-level objects; as a result, their reports contain many false warnings and lack the semantic information needed to diagnose problems. This paper introduces a specification-based technique called LeakChaser that can not only precisely capture the unnecessary references leading to leaks, but also explain, with high-level semantics, why these references become unnecessary. At the heart of LeakChaser is a three-tier approach that uses varying levels of abstraction to assist programmers with different skill levels and degrees of code familiarity in finding leaks. At the highest tier, the programmer only needs to specify the boundaries of coarse-grained activities, referred to as transactions. The tool monitors the execution and automatically infers liveness properties of these transactions in order to find unnecessary references. Diagnosis at this tier can be performed by any programmer after inspecting the APIs and basic modules of a program, without understanding the detailed implementation of these APIs. At the middle tier, the programmer can introduce application-specific semantic information by specifying properties for the transactions. At the lowest tier is a liveness checker that does not rely on higher-level semantic information, but instead allows a programmer to assert lifetime relationships for pairs of objects. This task can be performed only by skilled programmers who have a clear understanding of the data structures and algorithms in the program. We have implemented LeakChaser in Jikes RVM and used it to help us diagnose several real-world leaks. The implementation incurs a reasonable overhead for debugging and tuning. Our case studies indicate that the implementation is powerful in guiding programmers with varying code familiarity to the root causes of several memory leaks: even someone who had not studied a leaking program can quickly find the cause by using LeakChaser's iterative process, which infers and checks properties with different levels of semantic information.
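To make the highest tier more concrete, the following Java sketch models the idea of marking transaction boundaries and flagging objects that outlive their transaction. It is a toy illustration only: the class `TransactionSketch` and the methods `beginTransaction`, `allocate`, `cache`, and `endTransaction` are hypothetical names invented here, not LeakChaser's API, and the check is a simple membership test against one long-lived list rather than the heap monitoring performed inside Jikes RVM.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names come from LeakChaser itself.
class TransactionSketch {
    // Objects allocated since the current transaction began (compared by identity).
    private final Set<Object> allocatedInTransaction =
            Collections.newSetFromMap(new IdentityHashMap<>());

    // Stand-in for a long-lived structure that outlives every transaction.
    private final List<Object> longLivedCache = new ArrayList<>();

    // Mark the start of a coarse-grained activity.
    void beginTransaction() {
        allocatedInTransaction.clear();
    }

    // Record an allocation as belonging to the current transaction.
    <T> T allocate(T obj) {
        allocatedInTransaction.add(obj);
        return obj;
    }

    // Store a reference in the long-lived structure.
    void cache(Object obj) {
        longLivedCache.add(obj);
    }

    // At the transaction boundary, report transaction-local objects that the
    // long-lived structure still references; these are leak suspects.
    List<Object> endTransaction() {
        List<Object> suspects = new ArrayList<>();
        for (Object o : longLivedCache) {
            if (allocatedInTransaction.contains(o)) {
                suspects.add(o);
            }
        }
        allocatedInTransaction.clear();
        return suspects;
    }

    // One transaction that creates two objects and lets one escape.
    static List<Object> demo() {
        TransactionSketch t = new TransactionSketch();
        t.beginTransaction();
        StringBuilder temp = t.allocate(new StringBuilder("scratch"));
        StringBuilder escaped = t.allocate(new StringBuilder("request-42"));
        temp.append("!");   // used only inside the transaction
        t.cache(escaped);   // escapes into the long-lived structure
        return t.endTransaction();
    }

    public static void main(String[] args) {
        System.out.println("suspects: " + demo().size());
    }
}
```

In this model the programmer supplies only the two boundary calls; everything else is derived from what was allocated and what is still referenced at the end, which mirrors the division of labor the abstract describes for the highest tier. The lower tiers, where the programmer states transaction properties or asserts lifetime relationships between pairs of objects, are not modeled here.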
