Phase Characterization and Classification for Micro-architecture Soft Error

Transient faults have become a key challenge in modern processor design. Processor designers use the Architectural Vulnerability Factor (AVF) to estimate the soft error rate of micro-architectural structures. Dynamic, phase-based system reliability management, which tunes hardware and software parameters at runtime for different program phases, has become a focus of processor design. The phase characterization technique (PCT) and the phase classification algorithm (PCA) determine the accuracy of phase identification, which is the foundation of dynamic, phase-based system management. To our knowledge, this paper is the first to give a comprehensive evaluation and comparison of PCTs and PCAs for micro-architecture soft error. We first compare the efficiency of PCTs based on basic block vectors (BBV) and on performance metric counters (PMC) for reliability-oriented phase characterization of three micro-architectural structures (i.e., instruction queue, function unit and reorder buffer). Experimental results show that the PMC-based PCT performs better than the BBV-based PCT for most of the programs studied. We also compare the accuracy of three clustering algorithms (i.e., hierarchical clustering, k-means clustering and regression tree) in reliability-oriented phase classification. On average, the regression tree method improves classification accuracy by 30% compared with the other two PCAs. Based on these comparisons, we propose the combination of PMC and regression tree as the optimal pairing of PCT and PCA for soft error reliability-oriented phase identification. In addition, we quantify the upper bound on the predictability of AVF using BBV and PMC. On average, PMC explains 82% of AVF, while BBV explains 78%.
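As a rough illustration of the PMC plus regression tree pairing described above, the following minimal sketch groups execution intervals into phases by splitting on counter values so that AVF varies as little as possible within each leaf, and then reports the fraction of AVF variance those phases account for. It is not the paper's implementation: the counter values, AVF numbers, function names, and the `max_depth` and `min_leaf` settings are all hypothetical, chosen only to make the idea concrete.

```python
from statistics import fmean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(indices, rows, avf, min_leaf):
    """Return (feature, threshold, cost) minimising within-group AVF SSE, or None."""
    best = None
    for f in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest.
        for t in sorted({rows[i][f] for i in indices})[:-1]:
            left = [avf[i] for i in indices if rows[i][f] <= t]
            right = [avf[i] for i in indices if rows[i][f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[2]:
                best = (f, t, cost)
    return best

def grow(indices, rows, avf, max_depth, min_leaf=2):
    """Return the leaves of a regression tree; each leaf (a list of interval
    indices) is treated as one phase."""
    split = best_split(indices, rows, avf, min_leaf) if max_depth > 0 else None
    if split is None or split[2] >= sse([avf[i] for i in indices]):
        return [indices]
    f, t, _ = split
    left = [i for i in indices if rows[i][f] <= t]
    right = [i for i in indices if rows[i][f] > t]
    return (grow(left, rows, avf, max_depth - 1, min_leaf)
            + grow(right, rows, avf, max_depth - 1, min_leaf))

def explained_variance(leaves, avf):
    """Fraction of total AVF variance accounted for by the phase partition."""
    within = sum(sse([avf[i] for i in leaf]) for leaf in leaves)
    return 1.0 - within / sse(avf)

# Hypothetical per-interval PMC vectors: (IPC, average queue occupancy).
pmc = [(0.5, 10), (0.6, 12), (1.8, 40), (1.9, 42), (1.0, 25), (1.1, 26)]
# Hypothetical measured AVF of one structure for the same intervals.
avf = [0.10, 0.12, 0.40, 0.42, 0.25, 0.27]

leaves = grow(list(range(len(pmc))), pmc, avf, max_depth=2)
r2 = explained_variance(leaves, avf)
print("phases:", leaves)
print("fraction of AVF variance explained:", round(r2, 3))
```

The same `explained_variance` function could be applied to phases produced from BBV signatures or by k-means or hierarchical clustering, which is one way to compare characterization techniques and classification algorithms on a common footing.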
