Transistor density doubles with every new technology node. However, if Vcc is not scaled accordingly, electric field density and power demand grow. Therefore, Vcc must be scaled in step with new technology nodes to prevent excessive device degradation and to keep power demand within reasonable limits. Unfortunately, low-Vcc operation exacerbates the effect of process variations and reduces noise and stability margins, increasing the likelihood of errors in SRAM memories such as caches. These errors translate into performance loss and performance variation across cores, which is especially undesirable in a multi-core processor.
This paper presents (i) a novel scheme to tolerate high faulty-bit rates in caches by disabling only the faulty subblocks, (ii) a dynamic address remapping scheme that reduces performance variation across cores, which is key for performance predictability, and (iii) a comparison with state-of-the-art techniques for faulty-bit tolerance in caches. Results for typical first-level data cache configurations show a 15% average performance increase and a reduction in performance standard deviation from 3.13% to 0.55% compared to cache line disabling schemes.
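To make the subblock-disabling idea concrete, the following is a minimal sketch, not the paper's implementation, of how a cache line might carry a per-subblock fault mask so that only faulty subblocks are disabled rather than the whole line. The line and subblock sizes, structure layout, and function names are illustrative assumptions.

```c
/* Illustrative sketch only: one possible way to model per-subblock disabling,
 * assuming 64-byte lines split into 8-byte subblocks and a per-line fault
 * bitmask produced by a low-Vcc memory test. Names and sizes are hypothetical,
 * not taken from the paper. */
#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES      64
#define SUBBLOCK_BYTES  8
#define SUBBLOCKS       (LINE_BYTES / SUBBLOCK_BYTES)

typedef struct {
    uint64_t tag;
    bool     valid;
    uint8_t  faulty_mask;          /* bit i set => subblock i is faulty and disabled */
    uint8_t  data[LINE_BYTES];
} cache_line_t;

/* A line remains usable as long as at least one subblock is fault-free,
 * unlike line disabling, which discards the whole line on any faulty bit. */
static bool line_usable(const cache_line_t *l)
{
    return l->faulty_mask != (uint8_t)((1u << SUBBLOCKS) - 1);
}

/* An access hits only if the tag matches and the referenced subblock
 * is not one of the disabled (faulty) subblocks. */
static bool access_hits(const cache_line_t *l, uint64_t tag, unsigned line_offset)
{
    unsigned sub = line_offset / SUBBLOCK_BYTES;
    return l->valid && l->tag == tag && !(l->faulty_mask & (1u << sub));
}
```

Under this kind of organization, a faulty cell costs only the capacity of its subblock instead of an entire line, which is why the tolerable faulty-bit rate rises and the per-core capacity loss (and hence performance variation) shrinks.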