Novel Agent-Based Management for Fault-Tolerance in Network-on-Chip

We introduce a novel agent-based reconfiguration concept for future network-on-chip (NoC) systems and describe the properties needed to increase architecture-level fault tolerance. System control is modeled as a multi-level agent hierarchy in which agents react autonomously to increase application fault tolerance and performance. The agent technology adds a system-level layer of intelligence to traditional NoC system design. The architecture and functions of the system are described at a conceptual level. Communication and reconfiguration data flows are presented as case studies. The principles of reconfiguring a NoC in a faulty environment are demonstrated and simulated. The probability of successful reconfiguration is measured by Monte Carlo simulation for different latency requirements and amounts of redundancy. The simulations also examine the effect of network topology on the reconfiguration of a faulty mesh.
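The Monte Carlo measurement mentioned above can be illustrated with a minimal sketch. This is not the authors' simulator: the fault model (independent node failures with probability `p_fault`), the success criterion (a fault-free path from one mesh corner to the opposite corner within `max_hops`), and all function and parameter names are assumptions made for illustration only.

```python
import random
from collections import deque


def shortest_path_hops(healthy, src, dst):
    """Return the BFS hop count from src to dst over healthy nodes, or None."""
    # Assumption: a faulty source or destination counts as a failure.
    if not (healthy[src[0]][src[1]] and healthy[dst[0]][dst[1]]):
        return None
    rows, cols = len(healthy), len(healthy[0])
    dist = {src: 0}
    queue = deque([src])
    while queue:
        r, c = queue.popleft()
        if (r, c) == dst:
            return dist[(r, c)]
        # Mesh topology: each node links to its four direct neighbours.
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and healthy[nr][nc] and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return None


def estimate_success(size, p_fault, max_hops, trials=2000, seed=0):
    """Estimate the probability that a route within max_hops survives faults."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    src, dst = (0, 0), (size - 1, size - 1)
    successes = 0
    for _ in range(trials):
        # Hypothetical fault model: every node fails independently.
        healthy = [[rng.random() >= p_fault for _ in range(size)]
                   for _ in range(size)]
        hops = shortest_path_hops(healthy, src, dst)
        if hops is not None and hops <= max_hops:
            successes += 1
    return successes / trials


if __name__ == "__main__":
    for budget in (6, 8, 12):
        print(budget, estimate_success(4, 0.1, budget))
```

In this sketch the hop budget stands in for the latency requirement. Redundancy and alternative topologies could be explored by changing the mesh size or the neighbour rule, but how the original study models them is not stated in the abstract.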
