Analyzing the effects of compiler optimizations on application reliability

As transistor sizes decrease, transient faults are becoming a significant concern for processor designers. A rich body of research has focused on ways to estimate the vulnerability of systems to transient errors and on techniques to reduce their sensitivity to soft errors. In this research, we analyze how compiler optimizations impact the expected number of failures during the execution of an application. Typically, optimizations have two effects. First, they increase structures occupancies by allowing more instructions in flight, which in turn increases their susceptibility to soft errors. Additionally, they decrease execution time, decreasing the time during which the application is exposed to transient errors. In particular, we focus on how optimizations impact occupancies in three processor structures, namely the Reorder Buffer, the Instruction Fetch Queue and the Load Store Queue. We explain the interplay between compiler and reliability by studying the changes in the code made by the compiler and the resulting responses at the microarchitectural level. Results from this research allow us to make decisions to keep an application within its performance goals and its vulnerability during its runtime within a well defined FIT target.

[1]  T. Mudge,et al.  Drowsy caches: simple techniques for reducing leakage power , 2002, Proceedings 29th Annual International Symposium on Computer Architecture.

[2]  Michael F. P. O'Boyle,et al.  Evaluating the Effects of Compiler Optimisations on AVF , 2008 .

[3]  Gabriel H. Loh,et al.  Zesto: A cycle-level simulator for highly detailed microarchitecture exploration , 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.

[4]  John L. Henning SPEC CPU2000: Measuring CPU Performance in the New Millennium , 2000, Computer.

[5]  Trevor Mudge,et al.  MiBench: A free, commercially representative embedded benchmark suite , 2001 .

[6]  Shubu Mukherjee,et al.  Architecture Design for Soft Errors , 2008 .

[7]  David I. August,et al.  Software-controlled fault tolerance , 2005, TACO.

[8]  Bin Li,et al.  Versatile prediction and fast estimation of Architectural Vulnerability Factor from processor performance metrics , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.

[9]  Joel Emer,et al.  A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..

[10]  Xiaodong Li,et al.  SoftArch: an architecture-level tool for modeling and analyzing soft errors , 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).

[11]  Rudolf Eigenmann,et al.  Fast and effective orchestration of compiler optimizations for automatic performance tuning , 2006, International Symposium on Code Generation and Optimization (CGO'06).

[12]  Xiaodong Li,et al.  Online Estimation of Architectural Vulnerability Factor for Soft Errors , 2008, 2008 International Symposium on Computer Architecture.

[13]  Toshinori Sato,et al.  Formulating MITF for a Multicore Processor with SEU Tolerance , 2008, 2008 11th EUROMICRO Conference on Digital System Design Architectures, Methods and Tools.

[14]  Bin Li,et al.  Efficient Microarchitectural Vulnerabilities Prediction Using Boosted Regression Trees and Patient Rule Inductions , 2010, IEEE Transactions on Computers.

[15]  Arijit Biswas,et al.  Computing architectural vulnerability factors for address-based structures , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[16]  Craig B. Zilles,et al.  A characterization of instruction-level error derating and its implications for error detection , 2008, 2008 IEEE International Conference on Dependable Systems and Networks With FTCS and DCC (DSN).

[17]  Jun Yan,et al.  Evaluating instruction cache vulnerability to transient errors , 2006, MEDEA '06.

[18]  Robert Baumann,et al.  Soft errors in advanced computer systems , 2005, IEEE Design & Test of Computers.

[19]  Wei Zhang,et al.  Computing cache vulnerability to transient errors and its implication , 2005, 20th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'05).

[20]  Sudhanva Gurumurthi,et al.  Dynamic prediction of architectural vulnerability from microarchitectural state , 2007, ISCA '07.