Z-rays: divide arrays and conquer speed and flexibility

Arrays are the ubiquitous organization for indexed data. Throughout programming language evolution, implementations have laid out arrays contiguously in memory. This layout is problematic in both space and time: it causes heap fragmentation, garbage collection pauses proportional to array size, and wasted memory for sparse and over-provisioned arrays. Because managed languages virtualize arrays, a layout that consists of indirection pointers to fixed-size discontiguous memory blocks can mitigate these problems transparently. This design, however, incurs significant overhead, and is justified when real-time deadlines and space constraints trump performance. This paper proposes z-rays, a discontiguous array design that offers both flexibility and efficiency. A z-ray has a spine with indirection pointers to fixed-size memory blocks called arraylets, and uses five optimizations: (1) inlining the first N array bytes into the spine, (2) lazy allocation, (3) zero compression, (4) fast array copy, and (5) arraylet copy-on-write. Whereas discontiguous arrays in prior work improve responsiveness and space efficiency, z-rays combine time efficiency and flexibility. On average, the best z-ray configuration performs within 12.7% of an unmodified Java Virtual Machine on 19 benchmarks, whereas previous designs have two to three times higher overheads. Furthermore, language implementers can configure z-ray optimizations for various design goals. This combination of performance and flexibility creates a better building block for past and future array optimization.
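To make the layout concrete, the following is a minimal Java sketch of a discontiguous int array with a spine, first-N inlining, lazily allocated arraylets, and arraylet sharing on copy. It is an illustration under stated assumptions, not the paper's implementation: the class name `ZRaySketch`, its methods, and its parameters are hypothetical, sizes are counted in elements rather than bytes, and a real z-ray lives inside the virtual machine rather than in library code. A null spine entry stands in for an all-zero arraylet, which loosely models lazy allocation and zero compression together.

```java
/** Hypothetical sketch of a discontiguous int array; not the paper's implementation. */
final class ZRaySketch {
    private final int length;
    private final int inlineCount;   // first-N elements stored directly in the spine
    private final int arrayletSize;  // elements per arraylet (assumed parameter)
    private final int[] inlined;     // the inlined first-N elements
    private final int[][] arraylets; // spine pointers; null means "all zeros"
    private final boolean[] shared;  // true if the arraylet may be shared with a copy

    ZRaySketch(int length, int inlineCount, int arrayletSize) {
        this.length = length;
        this.inlineCount = Math.min(inlineCount, length);
        this.arrayletSize = arrayletSize;
        this.inlined = new int[this.inlineCount];
        int rest = length - this.inlineCount;
        int count = (rest + arrayletSize - 1) / arrayletSize;
        this.arraylets = new int[count][]; // lazy: no arraylet is allocated yet
        this.shared = new boolean[count];
    }

    private void check(int i) {
        if (i < 0 || i >= length) throw new ArrayIndexOutOfBoundsException(i);
    }

    int get(int i) {
        check(i);
        if (i < inlineCount) return inlined[i];      // fast path, no indirection
        int off = i - inlineCount;
        int[] block = arraylets[off / arrayletSize]; // one indirection via the spine
        return block == null ? 0 : block[off % arrayletSize];
    }

    void set(int i, int value) {
        check(i);
        if (i < inlineCount) { inlined[i] = value; return; }
        int off = i - inlineCount;
        int k = off / arrayletSize;
        if (arraylets[k] == null) {
            if (value == 0) return;                // zero into a zero block: allocate nothing
            arraylets[k] = new int[arrayletSize];  // lazy allocation on first non-zero write
        } else if (shared[k]) {
            arraylets[k] = arraylets[k].clone();   // copy-on-write: privatize before writing
        }
        shared[k] = false;
        arraylets[k][off % arrayletSize] = value;
    }

    /** Copies the array by sharing arraylets; data is duplicated only on a later write. */
    ZRaySketch copy() {
        ZRaySketch c = new ZRaySketch(length, inlineCount, arrayletSize);
        System.arraycopy(inlined, 0, c.inlined, 0, inlineCount);
        for (int k = 0; k < arraylets.length; k++) {
            if (arraylets[k] != null) {
                c.arraylets[k] = arraylets[k];
                // Conservative: both sides are marked, so each clones on its next write.
                shared[k] = true;
                c.shared[k] = true;
            }
        }
        return c;
    }

    /** Number of arraylets that currently occupy memory. */
    int allocatedArraylets() {
        int n = 0;
        for (int[] block : arraylets) if (block != null) n++;
        return n;
    }
}
```

In this sketch, a freshly created array of 100 elements with 8 inlined elements and 16-element arraylets allocates no arraylets until a non-zero value is written past index 7, and `copy()` moves only the inlined elements and the spine pointers. The per-arraylet `shared` flag is a simplification; it may clone an arraylet that is no longer shared, which costs time but never correctness.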
