Reliable On-Chip Memory Design for CMPs

Aggressive technology scaling in deep sub micron regime makes chips more susceptible to failures. This causes multiple realibility challenges in the design of modern chips, including manufacturing defects, wear-out, and parametric variations. With increasing area occupied by different on-chip memories in modern computing platforms such as Chip Multi-Processors (CMPs), memory reliability becomes a challenging issue. Traditional on-chip memory reliability techniques (e.g., ECC) incur significant power and performance overheads. To tackle such challenges, my research introduces several designs for fault-tolerance of both L1 and L2 cache memories in uni-core processors [1], Last-level Cache (LLC) in CMPs [3][4], and LLC in Networks-on-Chip (NoCs) [2].

[1]  Nikil D. Dutt,et al.  REMEDIATE: A scalable fault-tolerant architecture for low-power NUCA cache in tiled CMPs , 2013, 2013 International Green Computing Conference Proceedings.

[2]  Nikil D. Dutt,et al.  FFT-Cache: A Flexible Fault-Tolerant Cache architecture for ultra low voltage operation , 2011, 2011 Proceedings of the 14th International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES).

[3]  Nikil D. Dutt,et al.  A novel NoC-based design for fault-tolerance of last-level caches in CMPs , 2012, CODES+ISSS '12.