At-Speed Distributed Functional Testing to Detect Logic and Delay Faults in NoCs

In this work, we propose a distributed functional test mechanism for NoCs that scales to large networks with general topologies and routing algorithms. Each router and its links are tested by its neighbors in separate phases. Only the router under test is in test mode; all other parts of the NoC remain operational. We use triple modular redundancy (TMR) to make all testing components added to the switch robust. Experimental results show that our functional test approach can detect stuck-at, short, and delay faults in the routers and links. Our approach achieves 100 percent stuck-at fault coverage for the data path and 85 percent for the control paths, including the routing logic, the FIFO control path, and the arbiter, of a 5 × 5 router. We also show that our approach is able to detect delay faults in critical control and data paths. Synthesis results show that the area overhead of our test components with TMR support is 20 percent for covering stuck-at, delay, and short-wire faults, and 7 percent for covering only stuck-at and delay faults, in the 5 × 5 router. Simulation results show that our online testing approach has an average latency overhead of 3 percent on PARSEC traffic benchmarks on an 8 × 8 NoC.
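
To make the TMR idea concrete, the following is a minimal software sketch of 2-of-3 majority voting over three replicated outputs. It illustrates the general technique only and is not the hardware voter used in this work; the function names `majority_vote` and `disagreeing_replicas` are hypothetical and chosen for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three replica outputs.

    Each result bit is 1 when at least two of the three
    corresponding input bits are 1.
    """
    return (a & b) | (a & c) | (b & c)


def disagreeing_replicas(a: int, b: int, c: int) -> list[int]:
    """Return the indices (0, 1, 2) of replicas that differ from the voted value.

    Hypothetical helper: a single faulty replica is masked by the vote
    and can be identified here for diagnosis.
    """
    voted = majority_vote(a, b, c)
    return [i for i, value in enumerate((a, b, c)) if value != voted]


# Replica 2 has a bit flipped; the vote still yields the correct word.
good = 0b1010
faulty = 0b0010
voted_word = majority_vote(good, good, faulty)
bad_indices = disagreeing_replicas(good, good, faulty)
```

With a single faulty replica, the voted word equals the fault-free value, which is why replicating the test components three times tolerates one faulty copy.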
