Automated minimization of concurrent online checkers for Network-on-Chips

This paper introduces automated minimization of a set of concurrent online checkers for Networks-on-Chip (NoCs) under given fault detection quality constraints. The proposed framework allows accurate and complete evaluation of the fault detection capabilities of checkers, which in turn enables trade-offs between the area overhead of the checkers and the fault detection quality. The features of the automated minimization approach include formal proof of the absence or presence of true misses in checkers and minimal fault detection latency. The minimization technique is based on a divide-and-conquer approach that partitions the checkers' fault table into independent clusters. The checkers within each cluster are weighted, and the set of checkers is minimized using a heuristic method. Experiments on the control part (routing and arbitration) of an NoC router show that the proposed minimization approach achieves 100% fault coverage with very low area overhead.
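
The cluster-then-minimize idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the fault-table layout (a mapping from checker name to the set of faults it detects), the `area` cost per checker, the weight (newly covered faults per unit area), and the greedy selection rule are all assumptions chosen for illustration, and every function name is hypothetical. Clusters are taken here to be groups of checkers that share no detected faults with any checker outside the group.

```python
from collections import defaultdict


def cluster_checkers(fault_table):
    """Split checkers into clusters that share no faults (connected components)."""
    # Index: fault -> checkers that detect it.
    by_fault = defaultdict(set)
    for checker, faults in fault_table.items():
        for fault in faults:
            by_fault[fault].add(checker)

    seen, clusters = set(), []
    for start in sorted(fault_table):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            checker = stack.pop()
            if checker in component:
                continue
            component.add(checker)
            # Any checker detecting one of the same faults joins this cluster.
            for fault in fault_table[checker]:
                stack.extend(by_fault[fault] - component)
        seen |= component
        clusters.append(component)
    return clusters


def minimize_cluster(cluster, fault_table, area):
    """Greedy cover (assumed heuristic): pick the checker with the best weight."""
    uncovered = set().union(*(fault_table[c] for c in cluster))
    chosen = []
    while uncovered:
        # Assumed weight: newly covered faults per unit of area.
        best = max(sorted(cluster),
                   key=lambda c: len(fault_table[c] & uncovered) / area[c])
        chosen.append(best)
        uncovered -= fault_table[best]
    return chosen


def minimize_checkers(fault_table, area):
    """Minimize each independent cluster separately and merge the results."""
    selected = []
    for cluster in cluster_checkers(fault_table):
        selected.extend(minimize_cluster(cluster, fault_table, area))
    return sorted(selected)
```

Because clusters share no faults, each one can be minimized without affecting coverage in the others, which is what makes the divide-and-conquer split safe in this sketch. The selected set keeps every fault that the full checker set detects; a relaxed fault detection quality constraint would stop the greedy loop earlier, which is not shown here.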
