Precise Regression Benchmarking with Random Effects: Improving Mono Benchmark Results

Benchmarking as a method of assessing software performance is known to suffer from random fluctuations that distort the observed performance. In this paper, we focus on the fluctuations caused by compilation. We show that the design of a benchmarking experiment must reflect the existence of these fluctuations if the performance observed during the experiment is to be representative of reality. We present a new statistical model of a benchmark experiment that reflects the presence of fluctuations in compilation, execution and measurement. The model describes the observed performance and makes it possible to calculate the optimum dimensions of the experiment that yield the best precision within a given amount of time. Using a variety of benchmarks, we evaluate the model within the context of regression benchmarking and show that it significantly decreases the number of erroneously detected performance changes.
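
To illustrate the kind of experiment design the abstract describes, the following is a minimal Python sketch, not the paper's actual method: it assumes a three-level nested random-effects model y_ijk = mu + C_i + E_ij + eps_ijk for compilation, execution and measurement fluctuations, and uses hypothetical per-level time costs and variance values together with a brute-force search in place of the paper's analytical optimization.

```python
import random
import statistics

# Illustrative assumptions (not values from the paper):
MU = 100.0       # true mean performance
SIGMA_C = 3.0    # std. dev. of the compilation-level effect C_i
SIGMA_E = 2.0    # std. dev. of the execution-level effect E_ij
SIGMA_M = 1.0    # std. dev. of the measurement noise eps_ijk

def simulate(n_c, n_e, n_m, rng):
    """One experiment: n_c compilations x n_e executions x n_m measurements,
    drawn from the nested model y_ijk = MU + C_i + E_ij + eps_ijk."""
    samples = []
    for _ in range(n_c):
        c = rng.gauss(0, SIGMA_C)
        for _ in range(n_e):
            e = rng.gauss(0, SIGMA_E)
            for _ in range(n_m):
                samples.append(MU + c + e + rng.gauss(0, SIGMA_M))
    return statistics.mean(samples)

def variance_of_mean(n_c, n_e, n_m):
    """Variance of the grand mean under the nested model:
    sigma_c^2/n_c + sigma_e^2/(n_c*n_e) + sigma_m^2/(n_c*n_e*n_m)."""
    return (SIGMA_C**2 / n_c
            + SIGMA_E**2 / (n_c * n_e)
            + SIGMA_M**2 / (n_c * n_e * n_m))

def best_dimensions(budget, t_c=10.0, t_e=1.0, t_m=0.01):
    """Brute-force search for the dimensions that minimize the variance
    of the mean within a total time budget; t_c, t_e, t_m are assumed
    per-compilation, per-execution and per-measurement costs."""
    best = None
    for n_c in range(1, 50):
        for n_e in range(1, 50):
            for n_m in range(1, 200):
                cost = n_c * (t_c + n_e * (t_e + n_m * t_m))
                if cost > budget:
                    break  # cost grows with n_m, so no larger n_m fits
                v = variance_of_mean(n_c, n_e, n_m)
                if best is None or v < best[0]:
                    best = (v, n_c, n_e, n_m)
    return best

if __name__ == "__main__":
    rng = random.Random(42)
    v, n_c, n_e, n_m = best_dimensions(budget=300.0)
    print(f"best dimensions: {n_c} compilations x {n_e} executions "
          f"x {n_m} measurements, predicted variance {v:.4f}")
    print(f"simulated mean: {simulate(n_c, n_e, n_m, rng):.2f}")
```

The sketch makes the trade-off concrete: because the compilation-level variance is divided only by the number of compilations, precision gained by repeating measurements within a single binary is limited, and the search tends to spend the budget on additional compilations rather than on ever more measurements.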
