ORBIT: Effective Issue Queue Soft-Error Vulnerability Mitigation on Simultaneous Multithreaded Architectures Using Operand Readiness-Based Instruction Dispatch

With the advance of semiconductor processing technology, soft errors have become an increasing cause of failures of microprocessors fabricated using smaller and more densely integrated transistors with lower threshold voltages and tighter noise margins. With diminishing performance returns on wider issue superscalar processors, the microprocessor design industry has opted for using simultaneous multithreaded (SMT) architectures in commercial processors to exploit thread-level parallelism (TLP). SMT techniques enhance overall system performance but also introduce greater susceptibility to soft errors - concurrently executing multiple threads exposes many program runtime states to soft-error strikes at any given time. The issue queue (IQ) is a key micro architecture structure to exploit instruction-level and thread-level parallelism. On SMT processors, the IQ buffers a large number of instructions from multiple threads and is more susceptible to soft-error strikes. In this paper, we explore the use of operand-readiness-based instruction dispatch (ORBIT) as an effective mechanism to mitigate IQ soft-error vulnerability on SMT processors. We observe that IQ soft-error vulnerability is largely affected by instructions waiting for their source operands. The overall IQ soft-error vulnerability can be effectively reduced by minimizing the number of waiting instructions and their residency cycles in the IQ. We develop six techniques that aim to improve IQ reliability with negligible performance degradation on SMT processors. Moreover, we extend our techniques with prediction methods that can anticipate the readiness of source operands ahead of time. The ORBIT schemes integrated with reliability-awareness and readiness prediction achieve more attractive reliability/performance trade-offs. The best of the proposed schemes (e.g. Predict_DelayACE) reduces IQ vulnerability by 79% with only 1% throughput IPC and 3% harmonic IPC reduction across all studied workloads.

[1]  Brad Calder,et al.  Automatically characterizing large scale program behavior , 2002, ASPLOS X.

[2]  Glenn Reinman,et al.  Scaling the issue window with look-ahead latency prediction , 2004, ICS '04.

[3]  Y. C. Yeh,et al.  Triple-triple redundant 777 primary flight computer , 1996, 1996 IEEE Aerospace Applications Conference. Proceedings.

[4]  Quming Zhou,et al.  Design optimization for single-event upset robustness using simultaneous dual-VDD and sizing techniques , 2006, 2006 IEEE/ACM International Conference on Computer Aided Design.

[5]  Dean M. Tullsen,et al.  Handling long-latency loads in a simultaneous multithreading processor , 2001, MICRO.

[6]  Huiyang Zhou,et al.  A case for fault tolerance and performance enhancement using chip multi-processors , 2006, IEEE Computer Architecture Letters.

[7]  Todd M. Austin,et al.  Cyclone: a broadcast-free dynamic instruction scheduler with selective replay , 2003, ISCA '03.

[8]  Sanjay J. Patel,et al.  Reducing the Scheduling Critical Cycle Using Wakeup Prediction , 2004, 10th International Symposium on High Performance Computer Architecture (HPCA'04).

[9]  Steven K. Reinhardt,et al.  A scalable instruction queue design using dependence chains , 2002, ISCA.

[10]  Jack L. Lo,et al.  Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).

[11]  Arun K. Somani,et al.  Soft error sensitivity characterization for microprocessor dependability enhancement strategy , 2002, Proceedings International Conference on Dependable Systems and Networks.

[12]  Joseph J. Sharkey,et al.  Efficient instruction schedulers for SMT processors , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[13]  Glenn Reinman,et al.  Just say no: benefits of early cache miss determination , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..

[14]  Arijit Biswas,et al.  Computing architectural vulnerability factors for address-based structures , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[15]  Michael F. P. O'Boyle,et al.  Software directed issue queue power reduction , 2005, 11th International Symposium on High-Performance Computer Architecture.

[16]  Xiaodong Li,et al.  SoftArch: an architecture-level tool for modeling and analyzing soft errors , 2005, 2005 International Conference on Dependable Systems and Networks (DSN'05).

[17]  Ramon Canal,et al.  A low-complexity issue logic , 2000, ICS '00.

[18]  Pierre Michaud,et al.  Data-flow prescheduling for large instruction windows in out-of-order processors , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.

[19]  Ramon Canal,et al.  Reducing the complexity of the issue logic , 2001, ICS '01.

[20]  Kai Wang,et al.  Highly accurate data value prediction using hybrid predictors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[21]  James E. Smith,et al.  Saving energy with just in time instruction delivery , 2002, ISLPED '02.

[22]  Anand Sivasubramaniam,et al.  SlicK: slice-based locality exploitation for efficient redundant multithreading , 2006, ASPLOS XII.

[23]  Mehdi Baradaran Tahoori,et al.  Balancing Performance and Reliability in the Memory Hierarchy , 2005, IEEE International Symposium on Performance Analysis of Systems and Software, 2005. ISPASS 2005..

[24]  Steven R. Kunkel,et al.  A multithreaded PowerPC processor for commercial servers , 2000, IBM J. Res. Dev..

[25]  Vladimir Stojanovic,et al.  A cost-effective implementation of an ECC-protected instruction queue for out-of-order microprocessors , 2006, 2006 43rd ACM/IEEE Design Automation Conference.

[26]  Joel S. Emer,et al.  Techniques to reduce the soft error rate of a high-performance microprocessor , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[27]  D. Marr,et al.  Hyper-Threading Technology Architecture and MIcroarchitecture , 2002 .

[28]  Shubhendu S. Mukherjee,et al.  Transient fault detection via simultaneous multithreading , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[29]  Todd M. Austin,et al.  A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor , 2003, MICRO.

[30]  David H. Albonesi,et al.  Front-end policies for improved issue efficiency in SMT processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..

[31]  James E. Smith,et al.  Complexity-Effective Superscalar Processors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.

[32]  Sanjay J. Patel,et al.  Characterizing the effects of transient faults on a high-performance processor pipeline , 2004, International Conference on Dependable Systems and Networks, 2004.

[33]  Tao Li,et al.  Characterizing Microarchitecture Soft Error Vulnerability Phase Behavior , 2006, 14th IEEE International Symposium on Modeling, Analysis, and Simulation.

[34]  Xin Fu,et al.  An Analysis of Microarchitecture Vulnerability to Soft Errors on Simultaneous Multithreaded Architectures , 2007, 2007 IEEE International Symposium on Performance Analysis of Systems & Software.

[35]  David Kaeli,et al.  Reliability in the Shadow of Long-Stall Instructions , 2007 .