Quantifying and Improving the Performance of Garbage Collection

Researchers have devoted decades to improving garbage collection performance. A key question is: what opportunity remains for performance improvement? Because the perceived performance disadvantage of garbage collection is often cited as a barrier to its adoption, we propose using explicit memory management as the touchstone for evaluating garbage collection performance. Unfortunately, programmers using garbage-collected languages do not specify when to deallocate memory, precluding a direct comparison with explicit memory management. We observe that we can simulate the effect of explicit memory management by profiling applications and generating object reachability information and lifetimes to serve as zero-cost oracles that guide explicit memory reclamation. We call this approach oracular memory management. However, existing methods of generating exact object reachability are prohibitively expensive. We first present an object reachability time algorithm, which we call Merlin. We show that Merlin runs in time proportional to the running time of the application and can be performed either on-line or off-line. We then present empirical results showing that Merlin requires orders of magnitude less time than brute-force trace generation. We next present simulation results from experiments with the oracular memory manager. These results show that when memory is plentiful, garbage collection performance is competitive with explicit memory management, occasionally increasing program throughput by 10%. This performance comes at a price. To match the average performance of explicit memory management, the best-performing garbage collector requires an average heap size that is two-and-a-half or five times larger, depending on how aggressively the explicit memory manager frees objects. This performance gap is especially large when the heap exceeds physical memory. We show that the garbage collector causes poor paging performance.
During a full-heap collection, the collector scans all reachable objects. When a reachable object resides on a non-resident page, the virtual memory manager reloads that page and evicts another. If the evicted page also contains reachable objects, it too will ultimately be accessed by the collector and reloaded into physical memory. This paging is unavoidable without cooperation between the garbage collector and the virtual memory manager. We then present a cooperative garbage collection algorithm that we call bookmarking collection (BC). BC works with the virtual memory manager to decide which pages to evict. By providing available empty pages and, when paging is inevitable, processing pages prior to their eviction, BC avoids triggering page faults during garbage collection. When paging, BC reduces execution time over the next best collector by up to a factor of 3 and reduces maximum pause times by over an order of magnitude versus GenMS. This thesis begins by asking how well garbage collection currently performs and which areas remain for it to improve. We develop oracular memory management to help answer this question by enabling us to quantify garbage collection's performance. Our results show that while garbage collection can perform competitively, this good performance assumes the heap fits in available memory. With our attention thus focused on this brittleness, we design and implement the bookmarking collector. While performing competitively when not paging, our new collector's paging performance helps make garbage collection's performance much more robust.
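
The paging behavior described above, where a full-heap collection touches every page holding reachable objects, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the thesis's experimental setup and not an implementation of BC: it assumes LRU page replacement (the abstract does not name a policy), treats every heap page as holding reachable objects, and uses the hypothetical helper names `count_page_faults` and `full_heap_scan`.

```python
from collections import OrderedDict


def count_page_faults(page_accesses, resident_frames):
    """Count page faults for an access sequence under LRU replacement (assumed policy)."""
    resident = OrderedDict()  # keys are resident pages, ordered least to most recently used
    faults = 0
    for page in page_accesses:
        if page in resident:
            # Hit: mark the page as most recently used.
            resident.move_to_end(page)
        else:
            # Miss: the page must be loaded, evicting the LRU page if memory is full.
            faults += 1
            if len(resident) >= resident_frames:
                resident.popitem(last=False)
            resident[page] = None
    return faults


def full_heap_scan(heap_pages, collections):
    """Access sequence of a collector that touches every heap page once per collection."""
    return [page for _ in range(collections) for page in range(heap_pages)]


# Heap fits in memory: only the first scan faults (cold misses).
faults_when_heap_fits = count_page_faults(full_heap_scan(4, 3), resident_frames=4)

# Heap is twice the size of memory: each scan evicts pages the next scan needs.
faults_when_heap_exceeds = count_page_faults(full_heap_scan(8, 3), resident_frames=4)

print(faults_when_heap_fits, faults_when_heap_exceeds)
```

In this toy model the number of faults jumps from one per page in total to one per page on every collection once the heap no longer fits, which is the brittleness the abstract attributes to collectors that do not cooperate with the virtual memory manager.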
