Influence of program inputs on the selection of garbage collectors

Many studies have shown that the best performer among a set of garbage collectors tends to be different for different applications. Researchers have proposed application-specific selection of garbage collectors. In this work, we concentrate on a second dimension of the problem: the influence of program inputs on the selection of garbage collectors. We collect tens to hundreds of inputs for a set of Java benchmarks, and measure their performance on Jikes RVM with different heap sizes and garbage collectors. A rigorous statistical analysis produces four-fold insights. First, inputs influence the relative performance of garbage collectors significantly, causing large variations to the top set of garbage collectors across inputs. Profiling one or few runs is thus inadequate for selecting the garbage collector that works well for most inputs. Second, when the heap size ratio is fixed, one or two types of garbage collectors are enough to stimulate the top performance of the program on all inputs. Third, for some programs, the heap size ratio significantly affects the relative performance of different types of garbage collectors. For the selection of garbage collectors on those programs, it is necessary to have a cross-input predictive model that predicts the minimum possible heap size of the execution on an arbitrary input. Finally, based on regression techniques, we demonstrate the predictability of the minimum possible heap size, indicating the potential feasibility of the input-specific selection of garbage collectors.

[1]  Perry Cheng,et al.  Oil and water? High performance garbage collection in Java with MMTk , 2004, Proceedings. 26th International Conference on Software Engineering.

[2]  Robert P. Fitzgerald,et al.  The case for profile-directed selection of garbage collectors , 2000, ISMM '00.

[3]  Yutao Zhong,et al.  Predicting whole-program locality through reuse distance analysis , 2003, PLDI '03.

[4]  Gavin Brown,et al.  Intelligent selection of application-specific garbage collectors , 2007, ISMM '07.

[5]  Amer Diwan,et al.  The DaCapo benchmarks: java benchmarking development and analysis , 2006, OOPSLA '06.

[6]  Adam Welc,et al.  Improving virtual machine performance using a cross-run profile repository , 2005, OOPSLA '05.

[7]  Chen Ding,et al.  Predicting locality phases for dynamic memory optimization , 2007, J. Parallel Distributed Comput..

[8]  Tony Printezis,et al.  Hot-Swapping Between a Mark&Sweep and a Mark&Compact Garbage Collector in a Generational Environment , 2001, Java Virtual Machine Research and Technology Symposium.

[9]  Feng Mao,et al.  Cross-Input Learning and Discriminative Prediction in Evolvable Virtual Machines , 2009, 2009 International Symposium on Code Generation and Optimization.

[10]  J. N. Amaral,et al.  Benchmark Design for Robust Profile-Directed Optimization , 2007 .

[11]  David W. Wall,et al.  Predicting program behavior using real or estimated profiles , 2004, SIGP.

[12]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.

[13]  J. Gregory Morrisett,et al.  Comparing mostly-copying and mark-sweep conservative collection , 1998, ISMM '98.

[14]  Matthew Arnold,et al.  Adaptive optimization in the Jalapeño JVM , 2000, OOPSLA '00.

[15]  Feng Mao,et al.  Modeling Relations between Inputs and Dynamic Behavior for General Programs , 2007, LCPC.

[16]  Benjamin G. Zorn,et al.  Comparing mark-and sweep and stop-and-copy garbage collection , 1990, LISP and Functional Programming.

[17]  Lieven Eeckhout,et al.  Statistically rigorous java performance evaluation , 2007, OOPSLA.

[18]  Perry Cheng,et al.  Myths and realities: the performance impact of garbage collection , 2004, SIGMETRICS '04/Performance '04.

[19]  Lieven Eeckhout,et al.  Quantifying the Impact of Input Data Sets on Program Behavior and its Applications , 2003, J. Instr. Level Parallelism.

[20]  Chandra Krintz,et al.  Dynamic selection of application-specific garbage collectors , 2004, ISMM '04.