Flexible Spare Core Placement in Torus Topology Based NoCs and Its Validation on an FPGA

In the nanoscale era, the Network-on-Chip (NoC) interconnection paradigm has gained importance as a way to address the communication challenges in Chip Multi-Processors (CMPs). With increasing integration density on CMPs, NoC components, namely cores, routers, and links, are susceptible to failures. Therefore, to improve system reliability, efficient fault-tolerant techniques are needed to mitigate permanent faults in NoC-based CMPs. Several existing fault-tolerant techniques address permanent faults in application cores by placing spare cores onto NoC topologies. However, these techniques are limited to Mesh topology-based NoCs. A few approaches have realized fault-tolerant solutions on an FPGA, but their study of the architectural aspects of the NoC is limited. This paper presents the flexible placement of a spare core onto a Torus topology-based NoC design by considering core faults, and validates it on an FPGA. In the first phase, a mathematical formulation based on Integer Linear Programming (ILP) and a meta-heuristic based on Particle Swarm Optimization (PSO) are proposed for the placement of the spare core. In the second phase, we implement the NoC router addressing scheme, routing algorithm, run-time fault injection model, and fault-tolerant placement of the spare core onto the Torus topology using an FPGA. Experiments are performed on different multimedia and synthetic application benchmarks, in both static and dynamic simulation environments, followed by hardware implementation. In the static simulation environment, the experiments are carried out by scaling the network size and the router faults in the network. The results obtained with our approach outperform methods such as Fault-tolerant Spare Core Mapping (FSCM), Simulated Annealing (SA), and Genetic Algorithm (GA) proposed in the literature.
For the experiments that scale the network size, the proposed methodology shows an average improvement in communication cost of 18.83%, 4.55%, and 12.12% over FSCM, SA, and GA, respectively. For the experiments that scale the router faults in the network, our approach shows an improvement of 34.27%, 26.26%, and 30.41% over FSCM, SA, and GA, respectively. In the dynamic simulations, our approach shows an average improvement of 5.67%, 0.44%, and 3.69% over FSCM, SA, and GA, respectively. In the hardware implementation, our approach shows an average improvement in application runtime of 5.38%, 7.45%, and 27.10% over SA, GA, and FSCM, respectively. These results indicate that the proposed approach outperforms the approaches presented in the literature.
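
To make the communication-cost metric concrete, the following minimal Python sketch shows how such a cost could be evaluated on a Torus and how it changes when a faulty core is remapped to a spare tile. It assumes the common mapping-literature definition of cost as bandwidth multiplied by hop count; the abstract does not state the exact formulation used in the paper. The names `torus_hops`, `communication_cost`, and `remap_to_spare`, as well as the traffic values and tile coordinates, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]  # (row, column) of a router tile


def torus_hops(a: Coord, b: Coord, rows: int, cols: int) -> int:
    """Minimal hop count between two tiles on a rows x cols Torus.

    Each dimension has wrap-around links, so the distance along a
    dimension is the shorter of the direct and the wrap-around path.
    """
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    return min(dr, rows - dr) + min(dc, cols - dc)


def communication_cost(traffic: Dict[Tuple[str, str], int],
                       placement: Dict[str, Coord],
                       rows: int, cols: int) -> int:
    """Assumed cost model: sum of bandwidth x hop count over all core pairs."""
    return sum(bw * torus_hops(placement[src], placement[dst], rows, cols)
               for (src, dst), bw in traffic.items())


def remap_to_spare(placement: Dict[str, Coord],
                   faulty_core: str, spare: Coord) -> Dict[str, Coord]:
    """Return a new placement with the faulty core's task moved to the spare tile."""
    updated = dict(placement)
    updated[faulty_core] = spare
    return updated


# Hypothetical 4x4 Torus with three communicating cores.
placement = {"c0": (0, 0), "c1": (0, 3), "c2": (3, 0)}
traffic = {("c0", "c1"): 10, ("c0", "c2"): 5, ("c1", "c2"): 2}

before = communication_cost(traffic, placement, 4, 4)
after = communication_cost(traffic, remap_to_spare(placement, "c1", (1, 1)), 4, 4)
print(before, after)
```

In this toy setting, the wrap-around links make the corner tiles direct neighbours, and the cost after a fault depends on which tile holds the spare core. Choosing that tile so that the cost stays low is the placement problem that the ILP and PSO formulations address.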
