Managing SER costs of complex systems through Linear Programming

Single Event Effects (SEEs) degrade the reliability of complex electronic devices and systems. System architects, reliability engineers and digital designers must invest considerable resources to meet the reliability goals set by the end user or application. The overheads of Soft Error Rate (SER) mitigation techniques (e.g. additional power and reduced performance) may render the product less competitive. This paper proposes an approach that allows a system architect to select the best SEE management techniques subject to given cost and performance constraints. In this methodology, the costs of SER protection (area, power, engineering effort, IP costs) are expressed as a cost function of the selected protection schemes. A separate function expresses the reliability and/or availability in terms of the same protection schemes. Linear Programming techniques are then used to select a set of protection techniques that minimizes the costs while meeting the reliability constraints. This systematic approach enables system architects to find a minimal-cost SER protection strategy, thus reducing over-design and unnecessary overheads.
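The selection step can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a single reliability constraint (a target SER in FIT), a linear cost, and continuous decision variables x_i in [0, 1] read as the fraction of instances that receive scheme i. Under those assumptions the linear program min Σ c_i x_i subject to Σ r_i x_i ≥ R_raw − R_target can be solved exactly by applying schemes in order of cost per FIT removed, so no solver library is needed. The scheme names, costs and FIT values below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scheme:
    name: str
    cost: float         # hypothetical combined cost (area, power, effort, IP)
    fit_removed: float  # hypothetical SER reduction in FIT when fully applied


def min_cost_protection(schemes, raw_fit, target_fit):
    """Minimize total cost subject to residual SER <= target_fit.

    Solves: min sum(cost_i * x_i)
            s.t. sum(fit_removed_i * x_i) >= raw_fit - target_fit, 0 <= x_i <= 1.
    With one constraint and box bounds, the greedy cost-per-FIT rule is optimal.
    """
    required = raw_fit - target_fit
    selection = {s.name: 0.0 for s in schemes}
    if required <= 0:
        return selection, 0.0  # target already met without protection

    useful = [s for s in schemes if s.fit_removed > 0]
    if sum(s.fit_removed for s in useful) < required:
        raise ValueError("reliability target cannot be met with these schemes")

    total_cost = 0.0
    # Cheapest FIT reduction first.
    for s in sorted(useful, key=lambda s: s.cost / s.fit_removed):
        if required <= 0:
            break
        fraction = min(1.0, required / s.fit_removed)
        selection[s.name] = fraction
        total_cost += fraction * s.cost
        required -= fraction * s.fit_removed
    return selection, total_cost


# Hypothetical example: raw SER of 1000 FIT, target of 300 FIT.
schemes = [
    Scheme("ecc_memories", cost=4.0, fit_removed=400.0),
    Scheme("parity_regfiles", cost=3.0, fit_removed=200.0),
    Scheme("hardened_ff", cost=10.0, fit_removed=400.0),
]
selection, total_cost = min_cost_protection(schemes, raw_fit=1000.0, target_fit=300.0)
print(selection, total_cost)
```

In this toy instance the two cheaper schemes are applied in full and only a quarter of the flip-flops are hardened, which is the kind of over-design reduction the abstract describes. Several simultaneous constraints (for example separate area and power budgets) or strictly all-or-nothing choices would require a general LP or integer programming solver instead of the greedy rule.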
