Code Layout as a Source of Noise in JVM Performance

We describe the effect of a particular form of "noise" in benchmarking. We investigate the source of anomalous measurement data in a series of optimization strategies that attempt to improve the runtime performance of the garbage collector in a Java virtual machine. The results of our experiments can be explained by differences in code layout and, hence, in instruction and data cache behaviour. We show that unintended changes in code layout, caused by code modifications as trivial as symbol renaming, can contribute up to 2.7% of measured machine cycle cost, 20% of data cache misses, and 37% of instruction cache misses.
