On-Chip Cache Device Scaling Limits and Effective Fault Repair Techniques in Future Nanoscale Technology

In this study, we investigate cache fault-tolerance techniques to determine which will be most effective when on-chip memory cell defect probabilities exceed those of current technologies, as is widely anticipated for processor on-chip caches manufactured in future nanometer-scale technologies. Our most significant finding is that the devices in on-chip memory cells cannot be scaled at the same rate as devices in logic circuits, because the number of erroneous memory cells grows with voltage scaling, which in turn requires strong fault-tolerance techniques. Second, we propose a technique that minimizes the performance impact of aggressive technology and voltage scaling. It merges pairs of faulty cache lines to form good lines, and it performs better than triple modular redundancy (TMR) at high error rates and at lower cost. We also estimate up to 28% energy savings at low voltage relative to a recent fault-tolerance scheme [1].
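
To illustrate the idea of merging pairs of faulty cache lines, the following is a minimal sketch and not the scheme evaluated in this study. It assumes that each line's faults are recorded as a bitmask with one bit per word, and that two faulty lines can be merged only when their faulty word positions do not overlap, so that each word can be served from whichever line holds a good copy. The names `can_merge` and `pair_faulty_lines`, the per-word fault map, and the greedy pairing order are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def can_merge(mask_a: int, mask_b: int) -> bool:
    """Assumed merge rule: two lines are compatible when no word
    position is faulty in both (their fault masks do not overlap)."""
    return (mask_a & mask_b) == 0


def pair_faulty_lines(fault_masks: List[int]) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Greedily pair faulty lines whose fault masks are disjoint.

    fault_masks[i] has bit w set when word w of line i is faulty
    (hypothetical representation). Returns the merged pairs and the
    faulty lines that could not be paired.
    """
    faulty = [i for i, mask in enumerate(fault_masks) if mask != 0]
    used = set()
    pairs = []
    for pos, a in enumerate(faulty):
        if a in used:
            continue
        # Take the first later faulty line that is still free and compatible.
        for b in faulty[pos + 1:]:
            if b not in used and can_merge(fault_masks[a], fault_masks[b]):
                pairs.append((a, b))
                used.update((a, b))
                break
    unpaired = [i for i in faulty if i not in used]
    return pairs, unpaired


# Example: five lines of four words each; line 0 is fault-free.
masks = [0b0000, 0b0011, 0b1100, 0b0001, 0b0001]
pairs, unpaired = pair_faulty_lines(masks)
print(pairs, unpaired)
```

Under these assumptions, each merged pair yields one usable line from two physical lines, whereas TMR spends three copies per protected line. Lines 3 and 4 in the example stay unpaired because both are faulty in the same word.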

[1] Kaushik Roy, et al. Modeling of failure probability and statistical design of SRAM array for yield enhancement in nanoscaled CMOS, 2005, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.

[2] Lorena Anghel, et al. A memory built-in self-repair for high defect densities based on error polarities, 2003, Proceedings 18th IEEE Symposium on Defect and Fault Tolerance in VLSI Systems.

[3] L. Joiner, et al. Decoding binary BCH codes, 1995, Proceedings IEEE Southeastcon '95. Visualize the Future.

[4] Kunle Olukotun, et al. Niagara: a 32-way multithreaded Sparc processor, 2005, IEEE Micro.

[5] A. J. KleinOsowski, et al. The NanoBox project: exploring fabrics of self-correcting logic blocks for high defect rate molecular device technologies, 2004, IEEE Computer Society Annual Symposium on VLSI.

[6] Mark D. Hill, et al. Performance Implications of Tolerating Cache Faults, 1993, IEEE Trans. Computers.

[7] James F. Frenzel, et al. Defect-tolerant cache memory design, 1993, Digest of Papers Eleventh Annual 1993 IEEE VLSI Test Symposium.

[8] Kevin Reick, et al. Power4 System Design for High Reliability, 2002, IEEE Micro.