C4: the continuously concurrent compacting collector

This paper introduces and describes C4, the Continuously Concurrent Compacting Collector, an updated generational form of the Pauseless GC Algorithm [7], along with details of its implementation on modern X86 hardware. C4 uses a read barrier to support concurrent compaction, concurrent remapping, and concurrent incremental update tracing. It differentiates itself from other generational garbage collectors by supporting simultaneous-generational concurrency: the different generations are collected using concurrent (non-stop-the-world) mechanisms that can be simultaneously and independently active. C4 can continuously perform concurrent young generation collections, even during long periods of concurrent full heap collection, which allows it to sustain high allocation rates and maintain the efficiency typical of generational collectors without sacrificing response times or reverting to stop-the-world operation. Azul Systems has been shipping a commercial implementation of the Pauseless GC mechanism since 2005. Three successive generations of Azul's Vega series systems relied on custom multi-core processors and a custom OS kernel to deliver both the scale and the features needed to support Pauseless GC. In 2010, Azul released its first software-only commercial implementation of C4 for modern commodity X86 hardware, using Linux kernel enhancements to support the required feature set. We discuss implementation details of C4 on X86, including the Linux virtual and physical memory management enhancements that were used to support the high rate of virtual memory operations required for sustained pauseless operation. We also discuss updates to the collector's management of the heap for efficient generational collection, and we provide throughput and pause time data measured while running sustained workloads.
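
To make the role of the read barrier more concrete, the following is a minimal toy sketch of how a barrier on reference loads can let a collector move objects while the program keeps running. It is not Azul's implementation: the `ToyHeap` class, the `PAGE_SIZE` constant, the forwarding dictionary, and every method name are hypothetical, and real collectors work on machine addresses and page protection rather than Python dictionaries. The sketch assumes that a stale reference is detected by checking whether it points into a page being relocated, and that the barrier then rewrites the field it was loaded from so later loads take the fast path.

```python
# Toy model of a read barrier for concurrent compaction.
# All names here are hypothetical; this is an illustration, not C4 itself.

PAGE_SIZE = 16  # toy page size, measured in "addresses"


class ToyHeap:
    def __init__(self):
        self.objects = {}              # address -> dict of fields
        self.forwarding = {}           # old address -> new address
        self.relocating_pages = set()  # pages currently being compacted

    def page_of(self, addr):
        return addr // PAGE_SIZE

    def relocate(self, old_addr, new_addr):
        """Collector side: move an object and record where it went."""
        self.relocating_pages.add(self.page_of(old_addr))
        self.objects[new_addr] = self.objects.pop(old_addr)
        self.forwarding[old_addr] = new_addr

    def load_ref(self, holder_addr, field):
        """Mutator side: barrier applied to every reference load."""
        ref = self.objects[holder_addr][field]
        if ref is not None and self.page_of(ref) in self.relocating_pages:
            # Stale reference: look up the object's new location ...
            ref = self.forwarding.get(ref, ref)
            # ... and repair the field so the next load is cheap.
            self.objects[holder_addr][field] = ref
        return ref


heap = ToyHeap()
heap.objects[0] = {"next": 20}    # object at address 0 points to address 20
heap.objects[20] = {"value": 42}
heap.relocate(20, 40)             # collector moves the target object
current = heap.load_ref(0, "next")  # mutator sees the new address
```

In this sketch the mutator never observes the old address after relocation, and the field in the holding object is corrected as a side effect of the load, so the collector does not need to stop the program to fix every reference up front.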

[1]  S. L. Graham, et al.  List Processing in Real Time on a Serial Computer, 1978.

[2]  Kathryn S. McKinley, et al.  Immix: a mark-region garbage collector with space efficiency, fast collection, and mutator performance, 2008, PLDI '08.

[3]  V. T. Rajan, et al.  A real-time garbage collector with low overhead and consistent utilization, 2003, POPL '03.

[4]  Jan Vitek, et al.  Schism: fragmentation-tolerant real-time garbage collection, 2010, PLDI '10.

[5]  Michael Wolf, et al.  The pauseless GC algorithm, 2005, VEE '05.

[6]  Jinkyu Jeong.  Memory Management, 1992, Lecture Notes in Computer Science.

[7]  Malcolm Atkinson, et al.  Memory Management, 2021, Professional C++.

[8]  David M. Ungar, et al.  Generation Scavenging: A non-disruptive high performance storage reclamation algorithm, 1984, SDE 1.

[9]  Erez Petrank, et al.  The Compressor: concurrent, incremental, and parallel compaction, 2006, PLDI '06.

[10]  Rodney A. Brooks, et al.  Trading data space for reduced time and code space in real-time garbage collection on stock hardware, 1984, LFP '84.

[11]  David F. Bacon, et al.  Generational real-time garbage collection: a three-part invention for young objects, 2007.

[12]  Urs Hölzle, et al.  A Fast Write Barrier for Generational Garbage Collectors, 1993.

[13]  Filip Pizlo, et al.  Stopless: a real-time garbage collector for multiprocessors, 2007, ISMM '07.

[14]  David Dice, et al.  Supporting per-processor local-allocation buffers using multi-processor restartable critical sections, 2004.

[15]  David Detlefs, et al.  Garbage-first garbage collection, 2004, ISMM '04.

[16]  Filip Pizlo, et al.  A study of concurrent real-time garbage collectors, 2008, PLDI '08.

[17]  David Detlefs, et al.  A generational mostly-concurrent garbage collector, 2000, ISMM '00.

[18]  Paul R. Wilson, et al.  Uniprocessor Garbage Collection Techniques, 1992, IWMM.

[19]  Eran Yahav, et al.  Correctness-preserving derivation of concurrent garbage collection algorithms, 2006, PLDI '06.