Workload Reduction for Multi-input Feedback-Directed Optimization

Feedback-directed optimization is an effective technique for improving program performance, but it may make program performance and compiler behavior sensitive both to the inputs selected for training and to the actual input used in each run of the program. Cross-validation over a workload of inputs can address this input-sensitivity problem, but it introduces the need to select a representative workload of minimal size from the population of available inputs. We present a compiler-centric clustering methodology that groups similar inputs so that redundant inputs can be eliminated from the training workload. Input similarity is determined from the compile-time code transformations that the compiler applies after training separately on each input. Differences between inputs are weighted by a performance metric derived from cross-validation, so that transformation differences with little impact on performance carry less weight. We introduce the CrossError metric, which allows correlations between transformations to be explored based on the clustering results. The methodology is applied to several SPEC benchmark programs and illustrated with selected case studies.
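
The abstract does not specify the exact distance measure or clustering algorithm, so the following is only a minimal sketch of the general idea: summarize each training input by the compile-time transformation decisions it induces, weight transformation differences by a cross-validation performance penalty, and greedily group inputs whose weighted distance is small. All names and data here (transformations, slowdowns, cross_error, the 0.05 threshold) are hypothetical illustrations, not the paper's definitions; in particular, cross_error is a stand-in and not the CrossError metric defined in the paper.

```python
from itertools import combinations

# Hypothetical summary of the compile-time decisions the compiler makes
# after training feedback-directed optimization separately on each input.
transformations = {
    "input_A": {"inline:foo->bar", "unroll:loop3", "if-convert:loop7"},
    "input_B": {"inline:foo->bar", "unroll:loop3"},
    "input_C": {"peel:loop9", "if-convert:loop7"},
}

# Hypothetical cross-validation slowdowns: slowdowns[(i, j)] is the relative
# slowdown when the binary trained on input i runs on input j, compared with
# training on j itself (1.0 = no slowdown).
slowdowns = {
    ("input_A", "input_B"): 1.01, ("input_B", "input_A"): 1.02,
    ("input_A", "input_C"): 1.20, ("input_C", "input_A"): 1.15,
    ("input_B", "input_C"): 1.18, ("input_C", "input_B"): 1.22,
}

def cross_error(i, j):
    """Symmetric performance penalty between two training inputs (illustrative)."""
    return max(slowdowns[(i, j)], slowdowns[(j, i)]) - 1.0

def distance(i, j):
    """Transformation difference weighted by the cross-validation penalty,
    so decisions that differ but barely affect performance count for less."""
    diff = len(transformations[i] ^ transformations[j])       # decisions that differ
    union = len(transformations[i] | transformations[j]) or 1
    return (diff / union) * cross_error(i, j)

def cluster(inputs, threshold):
    """Greedy single-link grouping: inputs closer than `threshold` share a cluster."""
    clusters = [{i} for i in inputs]
    merged = True
    while merged:
        merged = False
        for a, b in combinations(range(len(clusters)), 2):
            if any(distance(x, y) < threshold
                   for x in clusters[a] for y in clusters[b]):
                clusters[a] |= clusters.pop(b)
                merged = True
                break
    return clusters

if __name__ == "__main__":
    groups = cluster(list(transformations), threshold=0.05)
    # One representative per cluster suffices for the reduced training workload.
    workload = [sorted(g)[0] for g in groups]
    print("clusters:", groups)
    print("reduced workload:", workload)
```

With the example data above, input_A and input_B differ in a single transformation and show negligible cross-validation slowdown, so they fall into one cluster and only one of them is kept, while input_C remains a separate representative.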
