On the predictability of program behavior using different input data sets

Reduced input sets, such as the test and train input sets, are commonly used in simulation to estimate the impact of architectural and micro-architectural features on the performance of SPEC benchmarks. They are also used for profile-feedback compiler optimizations. In this paper, we examine the reliability of reduced input sets for performance simulation and profile-feedback optimization. On the SPEC2000int benchmarks, we study high-level metrics such as IPC and procedure-level profiles, as well as lower-level measurements such as the execution paths exercised by the various input sets. Our study indicates that the test input sets are not suitable for simulation because their execution profiles are not similar to those of the reference input runs. The train input sets are better than the test input sets at maintaining profiles similar to those of the reference input sets. However, the observed execution paths leading to cache misses differ substantially between the smaller input sets and the reference input sets. For current profile-based optimizations, the differences in profile quality may not have a significant impact on performance, as tested on the Itanium processor with an Intel compiler. However, we believe the impact of profile quality will be greater for more aggressive profile-guided optimizations, such as cache prefetching.
