Customizable Fault Tolerant Caches for Embedded Processors

The continuing divergence of processor and memory speeds has led to increasing reliance on larger caches, which have become major consumers of area and power in embedded processors. Concurrently, intra-die and inter-die process variation at future technology nodes will cause defect-free yield to drop sharply unless mitigated. This paper focuses on an architectural technique for configuring cache designs to be resilient to memory cell failures caused by process variation. Profile-driven re-mapping of memory lines to cache lines is proposed to tolerate failures while minimizing degradation in average memory access time (AMAT), thereby significantly boosting performance-based die yield beyond what current techniques achieve. For example, with 50% of cache lines faulty, our technique increases AMAT by 12.5%, compared to a 60% increase with existing techniques.
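
To make the AMAT metric and the idea of profile-driven re-mapping concrete, the following is a minimal sketch and not the algorithm from the paper. It uses the standard definition AMAT = hit time + miss rate × miss penalty. The greedy policy in `remap_faulty_lines`, along with every function name and numeric value, is an assumption introduced purely for illustration: it assumes a direct-mapped cache whose per-line access counts were gathered by profiling.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: hit time plus the expected cost of a miss."""
    return hit_time + miss_rate * miss_penalty


def percent_increase(baseline, degraded):
    """Relative AMAT degradation, in percent, of a faulty cache vs. a fault-free one."""
    return 100.0 * (degraded - baseline) / baseline


def remap_faulty_lines(access_counts, faulty):
    """Hypothetical greedy remap (an illustrative assumption, not the paper's method).

    access_counts[i] is the profiled number of accesses that map to cache line i.
    faulty is the set of cache line indices that cannot be used.
    Returns a dict from original line index to the healthy line that serves it.
    """
    healthy = [i for i in range(len(access_counts)) if i not in faulty]
    if not healthy:
        raise ValueError("no healthy cache lines to remap onto")

    # Track the profiled load each healthy line carries after remapping.
    load = {i: access_counts[i] for i in healthy}
    mapping = {i: i for i in healthy}

    # Place the hottest faulty lines first so they land on the least-loaded targets.
    for idx in sorted(faulty, key=lambda i: access_counts[i], reverse=True):
        target = min(load, key=lambda i: (load[i], i))
        mapping[idx] = target
        load[target] += access_counts[idx]
    return mapping


if __name__ == "__main__":
    # Illustrative values only; none of these numbers come from the paper.
    print(amat(hit_time=1.0, miss_rate=0.25, miss_penalty=8.0))
    print(remap_faulty_lines([100, 5, 80, 1], faulty={0, 2}))
```

The sketch balances profiled access counts only. A real design would also need to account for conflict misses between memory lines that end up sharing a cache line, which this toy policy ignores.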
