Obtaining FPGA soft error rate in high performance information systems

Abstract

Soft errors caused by cosmic particles are a growing reliability threat for VLSI systems. FPGA-based designs are more vulnerable to soft errors than ASIC implementations because the majority of chip real estate is dedicated to memory bits, configuration bits, and user bits. Moreover, Single Event Upsets (SEUs) in the configuration bits of SRAM-based FPGAs result in permanent errors in the mapped design. FPGAs are widely used in the implementation of high performance information systems. Because the reliability requirements of these high performance information sub-systems are very stringent, the reliability of the FPGA chips used in their design plays a critical role in overall system reliability. In this paper, we validate the soft error rate of the FPGA-based designs used in the Logical Unit Module board of a commercial information system by comparing it with field error rates obtained from actual field failure data. This comparison confirms that our analytical tool is very accurate: there is an 81% overlap between the FIT rate range obtained with our analytical modeling framework and that of the field failure data studied. The tool can also be used to identify vulnerable modules within the FPGA for cost-effective reliability improvement.
