Real-time Error Monitoring System Considering Endurance and Data-retention Characteristics of TaOx-based ReRAM Storage with Workloads at Data Centers

Approximate computing has attracted attention because it reduces power consumption and improves performance by tolerating errors. To use approximate computing, however, a system must control the number of errors that occur in storage. This paper proposes a real-time error monitoring system for ReRAM storage that tracks how many errors occur in storage. To evaluate the system, measured ReRAM data and one-week workload logs from data centers are fed into it. The proposed system outputs the total bit error rate (BER), calculated from the number of Set/Reset cycles and the write interval time. In addition, the proposed system reveals that, over the one-week workload, the total BER of some sectors exceeds the error-correction limit of a BCH ECC with a code rate of 0.9.
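The monitoring flow described above (replay a workload log, track per-sector Set/Reset cycles and write interval, estimate the total BER, and flag sectors above the ECC limit) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the class and function names, the reading of "write interval time" as the time elapsed since a sector's last write, the `toy_ber_model` formula, and the `ber_limit` value are all assumptions made for the example. In the paper, the BER is derived from measured ReRAM data, which would take the place of the toy model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SectorState:
    cycles: int = 0          # number of Set/Reset (write) cycles seen so far
    last_write: float = 0.0  # timestamp of the most recent write, in seconds


class ErrorMonitor:
    """Hypothetical per-sector monitor; names and structure are illustrative."""

    def __init__(self, ber_model: Callable[[int, float], float], ber_limit: float):
        self.ber_model = ber_model  # maps (cycles, interval in seconds) to a BER
        self.ber_limit = ber_limit  # BER that the ECC is assumed to correct
        self.sectors: Dict[int, SectorState] = {}

    def record_write(self, sector: int, timestamp: float) -> None:
        # Each write in the workload log adds one cycle and resets the interval.
        state = self.sectors.setdefault(sector, SectorState())
        state.cycles += 1
        state.last_write = timestamp

    def total_ber(self, sector: int, now: float) -> float:
        state = self.sectors[sector]
        return self.ber_model(state.cycles, now - state.last_write)

    def sectors_over_limit(self, now: float) -> List[int]:
        return sorted(s for s in self.sectors
                      if self.total_ber(s, now) > self.ber_limit)


def toy_ber_model(cycles: int, interval_s: float) -> float:
    # ASSUMPTION: made-up monotone model, NOT measured ReRAM data.
    return min(1.0, cycles * 1e-6 * (1.0 + interval_s / 3600.0))


# Replay a tiny synthetic log: sector 0 is write-hot, sector 1 is written once.
monitor = ErrorMonitor(toy_ber_model, ber_limit=1e-3)  # limit is a placeholder
for t in range(1000):
    monitor.record_write(0, float(t))
monitor.record_write(1, 0.0)

now = 999.0 + 3600.0  # one hour after the last write to sector 0
flagged = monitor.sectors_over_limit(now)
```

Passing the BER model in as a callable keeps the bookkeeping (cycle counts and timestamps) separate from the device characterization, so a lookup table built from measurements could be substituted without changing the monitor itself.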
