Modeling the impact of permanent faults in caches

The performance and cost benefits we have enjoyed for decades from technology scaling are challenged by several critical constraints, including reliability. Increases in static and dynamic variations are leading to a higher probability of parametric and wear-out failures and are elevating reliability into a prime design constraint. In particular, the SRAM cells used to build caches, which dominate the processor area, are usually minimum sized and therefore more prone to failure. It is thus of paramount importance to develop effective methodologies that facilitate the exploration of reliability techniques for caches. To this end, we present an analytical model that, for a given cache configuration, address trace, and random probability of permanent cell failure, determines the exact expected miss rate and its standard deviation when blocks with faulty bits are disabled. What distinguishes our model is that it is fully analytical and avoids the use of fault maps, yet it is both exact and simpler than previous approaches. The model is used to produce expected miss-rate trends for future technology nodes for both uncorrelated and clustered faults. Some of the key findings based on the proposed model are: (i) block disabling has a negligible impact on the expected miss rate unless the probability of failure is equal to or greater than 2.6e-4; (ii) the fault-map methodology can accurately calculate the expected miss rate as long as 1,000 to 10,000 fault maps are used; and (iii) the expected miss rate for parallel applications increases with the number of threads, and, for a given probability of failure, the increase is more pronounced than for sequential execution.
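To make the block-disabling setting concrete, the following is a minimal sketch of the standard probability arithmetic behind it. It is not the model presented in this paper. It assumes independent (uncorrelated) cell failures. The function names and the 512-bit block size (64 bytes of data, tag bits ignored) are illustrative assumptions, and the per-associativity miss rates that the last function takes as input are hypothetical values a user would have to obtain from their own trace.

```python
from math import comb


def block_fault_probability(p_cell: float, bits_per_block: int) -> float:
    """Probability that a block has at least one faulty cell,
    assuming cells fail independently with probability p_cell."""
    return 1.0 - (1.0 - p_cell) ** bits_per_block


def faulty_ways_distribution(p_block: float, ways: int) -> list[float]:
    """Binomial distribution of the number of disabled ways in one set.
    Entry k is the probability that exactly k of the ways are faulty."""
    return [
        comb(ways, k) * p_block**k * (1.0 - p_block) ** (ways - k)
        for k in range(ways + 1)
    ]


def expected_miss_rate(miss_rate_by_usable_ways: list[float], p_block: float) -> float:
    """Simplified expectation: weight the miss rate observed with j usable
    ways (index j of the input list, a hypothetical user-supplied profile)
    by the probability that a set is left with exactly j usable ways."""
    ways = len(miss_rate_by_usable_ways) - 1
    dist = faulty_ways_distribution(p_block, ways)
    return sum(dist[k] * miss_rate_by_usable_ways[ways - k] for k in range(ways + 1))


if __name__ == "__main__":
    # 2.6e-4 is the cell failure probability quoted in the abstract;
    # the 512-bit block and 8-way associativity are assumptions.
    p_block = block_fault_probability(2.6e-4, 512)
    print(f"P(block disabled) = {p_block:.4f}")
    print("P(k faulty ways), k = 0..8:", faulty_ways_distribution(p_block, 8))
```

The last function treats every set as having the same miss-rate profile, which is a deliberate simplification; a model that works from an address trace, as the one described above does, would account for per-set behavior.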
