A systematic methodology to improve yield per area of highly-parallel CMPs

Manufacturing yield of chip multi-processors (CMPs) has become a significant problem as more transistors are integrated onto a single die and the defect rate keeps increasing for "end-of-Moore" nano-scale CMOS technologies. Since CMP designs usually have significant structural symmetry, adding spares to them should be an effective method for increasing yield per area, as is the case for memories. However, a systematic approach for adding spares to optimize CMP yield per area has never been developed, primarily due to the lack of (i) a general model of CMP architectures, and (ii) a practically usable model for computing the areas of chip versions with different numbers of spares. This paper develops such models and, in conjunction with a systematic approach for enumerating a wide range of spare configurations, uses them to compute the area overhead and yield of each configuration. In particular, this paper proposes a k-way spare sharing technique that efficiently traverses the design space for adding spares to obtain optimal spare configurations, i.e., those that maximize the yield per area of any CMP. Experimental results show significant yield per area improvements over previous approaches, and show that these benefits will continue to grow with increasing levels of parallelism in CMPs as well as with continued technology scaling.
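To make the yield-per-area trade-off concrete, the following is a minimal sketch, not the paper's model or its k-way spare sharing technique. It assumes a Poisson defect model for each module, identical and independently failing modules, and a fixed, hypothetical area overhead per spare; all function and parameter names are illustrative assumptions. A chip is counted as good if at least the required number of modules are defect-free, and the spare count is chosen by brute-force enumeration.

```python
from math import comb, exp


def module_yield(module_area, defect_density):
    """Yield of one module under an assumed Poisson defect model: exp(-A * D)."""
    return exp(-module_area * defect_density)


def yield_with_spares(n_required, n_spares, y_module):
    """Probability that at least n_required of (n_required + n_spares)
    identical, independent modules are defect-free (binomial tail)."""
    total = n_required + n_spares
    return sum(
        comb(total, good) * y_module**good * (1 - y_module) ** (total - good)
        for good in range(n_required, total + 1)
    )


def yield_per_area(n_required, n_spares, module_area, defect_density,
                   overhead_per_spare=0.0):
    """Chip yield divided by chip area.

    overhead_per_spare is a hypothetical stand-in for the switch and wiring
    area that each spare adds; it is assumed defect-free in this sketch.
    """
    y = module_yield(module_area, defect_density)
    chip_yield = yield_with_spares(n_required, n_spares, y)
    chip_area = (n_required + n_spares) * module_area + n_spares * overhead_per_spare
    return chip_yield / chip_area


def best_spare_count(n_required, module_area, defect_density,
                     max_spares=8, overhead_per_spare=0.0):
    """Exhaustively pick the spare count in [0, max_spares] that maximizes
    yield per area. Ties resolve to the smaller spare count."""
    return max(
        range(max_spares + 1),
        key=lambda s: yield_per_area(
            n_required, s, module_area, defect_density, overhead_per_spare
        ),
    )


if __name__ == "__main__":
    # Illustrative numbers only: 16 required cores, 10 area units each,
    # 0.01 defects per area unit.
    for spares in range(5):
        print(spares, yield_per_area(16, spares, 10.0, 0.01))
    print("best spare count:", best_spare_count(16, 10.0, 0.01))
```

In this simplified setting, spares pay off only when the gain in chip yield outweighs the added area, which is the trade-off the abstract describes. The sketch covers a single module type with one shared spare pool; the paper's approach additionally models general CMP architectures and explores how spares are shared.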
