Enhanced self-configurability and yield in multicore grids

As we move deeper into the nanotechnology era, computer architectures must manage tremendous numbers of devices per chip in the presence of high defect densities. These trends create new computing opportunities, but exploiting them efficiently will require a shift towards novel, highly parallel architectures. Fault-tolerance mechanisms will have to be integrated into the design to cope with the low yield of future nanofabrication processes. In this paper we consider multiprocessor grid (MPG) architectures that ensure scalability beyond hundreds of cores per chip. We study self-diagnosis and self-configuration methods at the architectural level and propose an enhanced self-configuration methodology that makes the maximum percentage of the available fault-free cores usable in MPGs with high defect densities. We show that, when the routers are fault-free, our approach makes all fault-free cores usable, whereas previous work was efficient only for defect densities of up to 20–25% of defective cores. We also address the case of faulty routers, where almost all fault-free nodes (fault-free cores with a fault-free router) remain usable even at very high defect densities in both the cores and the routers.
