Mitigating the impact of faults in unreliable memories for error-resilient applications

Inherently error-resilient applications in areas such as signal processing, machine learning, and data analytics provide opportunities for relaxing reliability requirements, thereby reducing the overhead incurred by conventional error-correction schemes. In this paper, we exploit the tolerable imprecision of such applications by designing an energy-efficient fault-mitigation scheme for unreliable data memories that meets a target yield. The proposed approach uses a bit-shuffling mechanism to isolate faults into bit locations of lower significance. This skews the bit-error distribution towards the low-order bits, substantially limiting the output error magnitude. By controlling the granularity of the shuffling, the proposed technique enables trading off quality for power, area, and timing overhead. Compared to error-correction codes, this can reduce the overhead by as much as 83% in read power, 77% in read access time, and 89% in area, when applied to various data mining applications in a 28 nm process technology.
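The bit-shuffling idea can be illustrated with a small software model. The following is a minimal sketch and not the circuit described in the paper: it assumes a 16-bit word, a single stuck-at fault per word whose position is known (for example from a memory test), and a rotation-based shuffle in which the word is split into segments of `seg_bits` bits. The names `rotl`, `rotr`, `read_with_shuffle`, and `seg_bits` are hypothetical and chosen only for illustration.

```python
WIDTH = 16  # assumed word width in bits (illustrative, not from the paper)
MASK = (1 << WIDTH) - 1


def rotl(x, k):
    """Rotate a WIDTH-bit word left by k positions."""
    k %= WIDTH
    return ((x << k) | (x >> (WIDTH - k))) & MASK


def rotr(x, k):
    """Rotate a WIDTH-bit word right by k positions."""
    return rotl(x, WIDTH - (k % WIDTH))


def read_with_shuffle(data, fault_bit, stuck_val, seg_bits):
    """Model a write followed by a read of a word with one stuck-at cell.

    Assumption: the word is rotated on write so that its least significant
    segment lands on the physical segment containing the faulty cell, and
    rotated back on read. Setting seg_bits = WIDTH disables the shuffle.
    """
    shift = (fault_bit // seg_bits) * seg_bits  # per-word shift amount
    stored = rotl(data, shift)                  # shuffle on write
    if stuck_val:                               # inject the stuck-at fault
        stored |= 1 << fault_bit
    else:
        stored &= ~(1 << fault_bit) & MASK
    return rotr(stored, shift)                  # undo the shuffle on read


# A stuck-at-1 cell in the MSB position, with and without shuffling.
print(hex(read_with_shuffle(0x0000, 15, 1, seg_bits=WIDTH)))  # 0x8000
print(hex(read_with_shuffle(0xA000, 15, 1, seg_bits=4)))      # 0xa008
```

In this model the faulty cell always maps to a logical bit position below `seg_bits`, so the error magnitude of a single fault is at most 2^(seg_bits - 1). A smaller `seg_bits` tightens that bound but requires more shift values to be stored per word (WIDTH / seg_bits of them), which mirrors the quality-versus-overhead trade-off controlled by the shuffling granularity.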
