The growing complexity of underlying systems, such as memory hierarchies and speculation mechanisms, is making proper performance evaluation difficult. This is a serious problem especially when we want to measure the overhead of adding new functionality to existing languages (or systems/applications), or to detect small changes in performance caused by small changes to programs. One problem is that equivalent executable programs that differ only in their instruction addresses (code placement) often exhibit significantly different performance. This difference can be explained by the fact that code placement affects the underlying branch predictors and instruction cache subsystems. Taking such code placement effects into account, this paper proposes an evaluation scheme that cancels accidental factors in code placement by statistically summarizing the performance of a sufficient number of artificial programs that differ from the evaluation target program (almost) only in their code placement. We developed a system, called Code Shaker, that supports performance evaluations based on the proposed scheme.
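The statistical summarization step can be illustrated with a minimal sketch. It assumes the run time of each code-placement variant has already been measured, and it summarizes the variants by their mean and a normal-approximation confidence interval. The function names `summarize` and `overhead`, the choice of a normal approximation, and the 95% default are assumptions made for illustration; they are not taken from Code Shaker itself.

```python
import math
import statistics
from statistics import NormalDist


def summarize(times, confidence=0.95):
    """Summarize run times measured over many code-placement variants.

    Returns (mean, half_width), where half_width is the half-width of a
    normal-approximation confidence interval around the mean.
    (Hypothetical helper; the normal approximation is an assumption.)
    """
    n = len(times)
    if n < 2:
        raise ValueError("need at least two placement variants")
    mean = statistics.mean(times)
    # Standard error of the mean across placement variants.
    sem = statistics.stdev(times) / math.sqrt(n)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, z * sem


def overhead(baseline_times, modified_times):
    """Relative overhead of the modified program, using the mean over
    placement variants of each program instead of a single build."""
    base_mean, _ = summarize(baseline_times)
    mod_mean, _ = summarize(modified_times)
    return mod_mean / base_mean - 1.0
```

Comparing the means over many variants, rather than one build of each program, is what keeps an accidental placement effect in a single executable from being mistaken for a real overhead. With few variants, a Student's t critical value would be a more conservative choice than the normal approximation used here.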