SubsetTrio: An evolutionary, geometric, and statistical benchmark subsetting framework

Motivated by the excessive benchmarking effort caused by a rapidly expanding design space, increasing system complexity, and prevailing ad-hoc and subjective subsetting practices, this article seeks to improve the efficiency of architecture exploration and evaluation by integrating a genetic algorithm, 3-D geometric rendering, and multivariate statistical analysis into one unified framework, SubsetTrio, which can subset any given benchmark suite based on its inherent workload characteristics, the desired workload-space coverage, and the total execution time the user intends to spend. By encoding both representativity (i.e., workload-space coverage, measured as the volume of the convex hull spanned by the benchmarks) and efficiency (i.e., total run time) as a co-optimization objective of a survival-of-the-fittest evolutionary algorithm, the framework systematically determines a globally "fittest" (i.e., most representative and efficient) benchmark subset for the workload-space coverage threshold specified by the user. We demonstrate the usage, efficacy, and efficiency of the proposed technique through a thorough case study on the SPEC benchmark suite and validate it on 50 commercial computer systems. Compared with the state-of-the-art statistical subsetting approach based on Principal Component Analysis (PCA), SubsetTrio selects a significantly more time-efficient subset while covering the same or a larger portion of the workload space.
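
As a rough illustration of the co-optimization idea, the sketch below encodes candidate subsets as bit strings and evolves them with a plain genetic algorithm, scoring each subset by the convex-hull volume it spans in a 3-D workload space relative to the full suite, and by its total run time once a coverage threshold is met. The benchmark coordinates, run times, coverage threshold, and GA parameters are all made-up placeholders, and scipy's ConvexHull merely stands in for the geometric analysis described above; this is a minimal sketch, not the authors' implementation.

```python
# Illustrative SubsetTrio-style subsetting sketch (hypothetical data throughout).
import numpy as np
from scipy.spatial import ConvexHull, QhullError

rng = np.random.default_rng(0)

# Hypothetical workload data: 26 benchmarks, each reduced to a 3-D point
# (e.g., its first three principal components) plus a per-benchmark run time.
N_BENCH = 26
points = rng.normal(size=(N_BENCH, 3))
run_times = rng.uniform(100, 1000, size=N_BENCH)
FULL_VOLUME = ConvexHull(points).volume
COVERAGE_THRESHOLD = 0.85          # user-specified workload-space coverage

def coverage(mask):
    """Fraction of the full suite's convex-hull volume spanned by a subset."""
    subset = points[mask]
    if len(subset) < 4:            # a 3-D hull needs at least 4 points
        return 0.0
    try:
        return ConvexHull(subset).volume / FULL_VOLUME
    except QhullError:             # degenerate (coplanar/collinear) subset
        return 0.0

def fitness(mask):
    """Co-optimize representativity and efficiency: below the coverage
    threshold, reward coverage; above it, reward shorter total run time."""
    cov = coverage(mask)
    if cov < COVERAGE_THRESHOLD:
        return cov
    return 1.0 + (1.0 - run_times[mask].sum() / run_times.sum())

# Plain generational GA over bit-string subset encodings.
POP, GENS, MUT = 60, 200, 1.0 / N_BENCH
pop = rng.random((POP, N_BENCH)) < 0.5
for _ in range(GENS):
    scores = np.array([fitness(ind) for ind in pop])
    # Binary tournament selection
    idx = rng.integers(0, POP, size=(POP, 2))
    parents = pop[np.where(scores[idx[:, 0]] > scores[idx[:, 1]],
                           idx[:, 0], idx[:, 1])]
    # Uniform crossover followed by bit-flip mutation
    partners = parents[rng.permutation(POP)]
    cross = rng.random((POP, N_BENCH)) < 0.5
    children = np.where(cross, parents, partners)
    children ^= rng.random((POP, N_BENCH)) < MUT
    # Elitism: carry the best individual into the next generation
    children[0] = pop[np.argmax(scores)]
    pop = children

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("Selected benchmarks:", np.flatnonzero(best))
print(f"Coverage: {coverage(best):.2%}, "
      f"run time: {run_times[best].sum():.0f}s of {run_times.sum():.0f}s")
```

In a real setting the 3-D coordinates would come from the multivariate statistical analysis step rather than random data, and the scalarized fitness above could be replaced by a multi-objective formulation that exposes the full trade-off between coverage and run time.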
