Localization of damaged resources in NoC based shared-memory MP2SOC, using a Distributed Cooperative Configuration Infrastructure

In this paper, we present a software approach for localizing faulty components in a 2D-mesh Network-on-Chip, targeting fault tolerance in a shared-memory MP2SoC architecture. We rely on a pre-existing distributed hardware infrastructure that supports self-test and deactivation of faulty components (routers and communication channels), which are thereby transformed into “black holes”. We detail the software method used to localize these “black holes” and to centralize the information at a single point, where a modified global routing function can be defined. This embedded software makes extensive use of a distributed fault-tolerant configuration firmware assisted by a Distributed Cooperative Configuration Infrastructure (DCCI), which is also presented. Finally, the detection and localization coverage for “black holes” is evaluated.
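
To make the idea of localizing “black holes” from a single point concrete, the following is a minimal sketch and not the method of the paper. It assumes a simplified model in which only routers (not individual channels) can be faulty, a deactivated router never answers, and a hypothetical function `localize_black_holes` explores the mesh from one working root node, recording every neighbor that stays silent. The function name, the coordinate convention, and the breadth-first exploration are all assumptions made for illustration; the DCCI and the configuration firmware are not modeled.

```python
from collections import deque


def localize_black_holes(width, height, faulty, root=(0, 0)):
    """Explore a width x height 2D mesh from `root` (illustrative model only).

    `faulty` is the set of (x, y) routers assumed to behave as black holes.
    Returns (reached, suspected): the working routers reachable from the
    root, and the black holes observed by at least one reached router.
    """
    if root in faulty:
        raise ValueError("root must be a working router")
    reached = {root}
    suspected = set()
    queue = deque([root])
    while queue:
        x, y = queue.popleft()
        # Probe the four mesh neighbors (east, west, north, south).
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nx < width and 0 <= ny < height):
                continue  # outside the mesh boundary
            if (nx, ny) in faulty:
                # A silent neighbor is reported back to the root.
                suspected.add((nx, ny))
            elif (nx, ny) not in reached:
                reached.add((nx, ny))
                queue.append((nx, ny))
    return reached, suspected


def coverage(faulty, suspected):
    """Fraction of the injected black holes that were localized."""
    return len(suspected & set(faulty)) / len(faulty) if faulty else 1.0
```

In this toy model, a black hole that is only adjacent to other black holes or to unreachable routers is never observed, which is one reason a coverage figure can fall below 100%.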
