An FPGA-Based Transient Error Simulator for Resilient Circuit and System Design and Evaluation

Error detection and correction (EDAC) is becoming increasingly important as device scaling continues. We propose a field-programmable gate array (FPGA)-based simulator that accelerates the transient simulation of pipeline-level EDAC circuits and their interactions with circuits under test (CUTs). The simulator incorporates the CUT delay profile, the CUT error profile, and the EDAC model, and it captures the fine-grained interactions between the CUT and the EDAC circuit so that the effectiveness of EDAC can be evaluated and tuned. Because the simulator is built from parameterized models, it is general purpose and widely applicable. We demonstrate its capability by evaluating two popular pipeline-level EDAC designs, pre-edge EDAC and post-edge EDAC, using synthesized processors that operate under generic error and noise models. The proposed error simulator uncovers key insights that help guide EDAC design.
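
To make the roles of the three parameterized inputs concrete, the following is a minimal software sketch, not the FPGA implementation described above. It assumes a single pipeline stage, a CUT delay profile given as (delay, probability) pairs, multiplicative Gaussian noise as a stand-in for a generic noise model, and a post-edge detector that flags any path arriving after the clock edge but within a fixed detection window. The function name, parameters, and outcome labels are all hypothetical and chosen only for illustration.

```python
import random


def simulate_stage(delay_profile, clock_period, detect_window,
                   noise_sigma, cycles, seed=0):
    """Monte Carlo sketch of one pipeline stage with a post-edge detector.

    delay_profile: list of (delay, probability) pairs (assumed CUT delay profile).
    noise_sigma:   relative std. dev. of delay noise (assumed generic noise model).
    Returns counts of cycles that met timing, were detected late, or escaped.
    """
    rng = random.Random(seed)
    delays = [d for d, _ in delay_profile]
    weights = [p for _, p in delay_profile]
    counts = {"ok": 0, "detected": 0, "undetected": 0}

    for _ in range(cycles):
        # Draw the path exercised this cycle, then perturb its delay.
        nominal = rng.choices(delays, weights=weights)[0]
        actual = nominal * (1.0 + rng.gauss(0.0, noise_sigma))

        if actual <= clock_period:
            counts["ok"] += 1            # data arrived before the clock edge
        elif actual <= clock_period + detect_window:
            counts["detected"] += 1      # late, but inside the detection window
        else:
            counts["undetected"] += 1    # too late for the detector to catch
    return counts


if __name__ == "__main__":
    profile = [(0.6, 0.7), (0.9, 0.25), (1.05, 0.05)]  # illustrative values only
    print(simulate_stage(profile, clock_period=1.0, detect_window=0.2,
                         noise_sigma=0.05, cycles=10_000))
```

Sweeping `clock_period` or `detect_window` in a sketch like this shows the kind of tuning question the simulator targets: how the rates of detected and undetected timing errors change as the EDAC parameters move. A software loop of this kind evaluates one cycle at a time and models none of the pipeline recovery behavior.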
