Fault tolerance and reliability in field-programmable gate arrays

Reduced device-level reliability and increased within-die process variability will become serious issues for future field-programmable gate arrays (FPGAs), and will result in faults developing dynamically during the lifetime of the integrated circuit. Fortunately, FPGAs can be reconfigured in the field and at runtime, which provides opportunities to overcome such degradation-induced faults. This study provides a comprehensive survey of fault-detection methods and fault-tolerance schemes specifically for FPGAs in the context of device degradation, with the goal of laying a strong foundation for future research in this field. All methods and schemes are quantitatively compared, and some particularly promising approaches are highlighted.
