Using tag-match comparators for detecting soft errors

Soft errors caused by high-energy particle strikes are an increasingly important problem in microprocessor design. As transistor density and die sizes grow, soft errors are expected to become a larger problem in the near future. Recovering from these unexpected faults may be possible by re-executing part of the program, but only if the error is detected first. It is therefore important to develop new techniques that detect soft errors and increase the fraction of errors that are detected. Modern microprocessors employ out-of-order execution and dynamic scheduling logic. Comparator circuits, which are used to keep track of data dependencies, are usually idle. In this paper, we propose various schemes that exploit on-chip comparators to detect transient faults. Our results show that around 50% of the errors in the wakeup logic can be detected with minimal hardware overhead by using the proposed techniques.
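
As a rough illustration of how an idle tag comparator might be put to work, the sketch below models one issue-queue entry that has a single pending source operand. It is not the scheme proposed in the paper; it only shows the general idea of redundant tag matching under stated assumptions. The names `IQEntry`, `flip_bit`, and `wakeup`, the 7-bit tag width, and the choice to copy the pending tag into the unused operand slot are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

TAG_BITS = 7  # assumed physical-register tag width (hypothetical)


@dataclass
class IQEntry:
    """Hypothetical issue-queue entry with one pending source operand."""
    tag: int     # stored source tag, checked by the primary comparator
    shadow: int  # copy of the same tag, checked by the otherwise idle comparator


def flip_bit(value: int, bit: int) -> int:
    """Model a particle strike as a single-bit flip in a stored tag."""
    assert 0 <= bit < TAG_BITS
    return value ^ (1 << bit)


def wakeup(entry: IQEntry, broadcast_tag: int) -> Tuple[bool, bool]:
    """Return (match, error_detected) for one destination-tag broadcast.

    Both comparators see the same broadcast tag. If the stored copies are
    intact they always agree, so a disagreement signals a corrupted tag.
    """
    primary = entry.tag == broadcast_tag
    redundant = entry.shadow == broadcast_tag
    return primary, primary != redundant


entry = IQEntry(tag=42, shadow=42)
print(wakeup(entry, 42))           # fault-free: both comparators match

entry.tag = flip_bit(entry.tag, 3)  # inject a single-bit fault into one copy
print(wakeup(entry, 42))           # comparators disagree, fault is flagged
```

In this toy model a fault is flagged only when the broadcast tag equals either the original or the corrupted value, and only entries with a free comparator are covered at all. Limits of this kind are one reason a scheme built on idle comparators would catch only a fraction of the faults.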
