Diagnosing memory leaks using graph mining on heap dumps

Memory leaks are caused by software programs that prevent the reclamation of memory that is no longer in use. They can cause significant slowdowns, exhaustion of available storage space and, eventually, application crashes. Detecting memory leaks is challenging because real-world applications are built on multiple layers of software frameworks, making it difficult for a developer to know whether observed references to objects are legitimate or the cause of a leak. We present a graph mining solution to this problem in which we analyze heap dumps to automatically identify subgraphs that could represent potential memory leak sources. Although heap dumps are commonly analyzed in existing heap profiling tools, our work is the first to apply a graph grammar mining solution to this problem. Unlike classical graph mining work, we show that it suffices to mine the dominator tree of the heap dump, which is significantly smaller than the underlying graph. Our approach not only identifies leak candidates and their structure but also provides aggregate information about the access paths to the leaks. We present several synthetic and real-world heap dumps for which our approach provides more insight into the problem than state-of-the-art tools such as Eclipse's MAT.
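To make the dominator-tree idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes the heap dump has already been reduced to a plain adjacency map of object references; the object names (`GC_ROOT`, `cache`, `session`, and so on) and the function `immediate_dominators` are hypothetical. It uses a simple iterative set-intersection computation, which is easy to follow but slower than the near-linear algorithms used in practice.

```python
from collections import defaultdict


def immediate_dominators(graph, root):
    """Return {node: immediate dominator} for every node reachable from root.

    graph maps each object to the objects it references (hypothetical format).
    """
    # Collect the nodes reachable from the root with an explicit-stack DFS.
    order, seen, stack = [], {root}, [root]
    while stack:
        node = stack.pop()
        order.append(node)
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)

    # Build predecessor sets restricted to reachable nodes.
    preds = defaultdict(set)
    for node in order:
        for succ in graph.get(node, ()):
            preds[succ].add(node)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(order) for n in order}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in order:
            if n == root:
                continue
            new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True

    # Strict dominators form a chain; the one with the largest dominator
    # set is the closest to n, i.e. its immediate dominator.
    idom = {}
    for n in order:
        if n != root:
            idom[n] = max(dom[n] - {n}, key=lambda d: len(dom[d]))
    return idom


# A toy heap: entry2 is referenced from both cache and session.
heap = {
    "GC_ROOT": ["cache", "session"],
    "cache": ["entry1", "entry2"],
    "session": ["entry2"],
    "entry1": ["payload1"],
    "entry2": ["payload2"],
}
idom = immediate_dominators(heap, "GC_ROOT")
for obj in sorted(idom):
    print(f"{obj} is immediately dominated by {idom[obj]}")
```

In this toy heap, `entry2` is reachable through both `cache` and `session`, so neither dominates it and its parent in the dominator tree is `GC_ROOT`. The tree has exactly one edge per non-root object, whereas the reference graph can have many more, which illustrates why the dominator tree is the smaller structure to mine.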
