CDS-RSRAM: a Reconfigurable SRAM Architecture to Reduce Read Power with Column Data Segmentation

SRAM access accounts for a significant share of on-chip power consumption in many signal processing systems. Reconfigurable data-adaptive SRAMs (RSRAMs) can save considerable read power by exploiting data patterns. In these RSRAM designs, the column size (i.e., the number of cells in one column) of the cell array defines the granularity at which data patterns can be exploited. However, circuit constraints prevent the column size from being made too small, which hides finer-grained data features and limits RSRAM's low-power read advantage. In this paper, we propose a reconfigurable SRAM architecture with column data segmentation (CDS-RSRAM) that overcomes this limitation and exploits data patterns more effectively without decreasing the column size. We partition the data in one column into several segments and perform statistical analysis on each segment separately. Each data segment has its own dedicated flag bit that controls its working mode during reads. This architecture leverages data patterns at a finer granularity and amplifies RSRAM's low-power read advantage. We also present a thorough overhead analysis and improve the mode decision strategy to minimize the power overhead. Simulation results show that, compared with the original RSRAM, the proposed architecture saves up to 36.8% of read power at an area overhead of 8.8%. Compared with 8T SRAM, the total power saving can be up to 77.1%.
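The effect of segmenting a column can be illustrated with a small behavioral sketch. It is a toy model under stated assumptions, not the circuit or the mode decision strategy from the paper: the function names `segment_flags` and `costly_reads` are hypothetical, the flag is assumed to be set by a simple majority vote over the bits of a segment, and a read is assumed to be "costly" whenever the stored bit differs from its segment's flag.

```python
from typing import List


def segment_flags(column: List[int], num_segments: int) -> List[int]:
    """Return one flag bit per segment (assumed rule: strict majority of ones)."""
    if num_segments <= 0 or len(column) % num_segments != 0:
        raise ValueError("column length must be a multiple of num_segments")
    seg_len = len(column) // num_segments
    flags = []
    for s in range(num_segments):
        segment = column[s * seg_len:(s + 1) * seg_len]
        # Flag is 1 only when ones are the strict majority in this segment.
        flags.append(1 if 2 * sum(segment) > len(segment) else 0)
    return flags


def costly_reads(column: List[int], num_segments: int) -> int:
    """Count bits that differ from their segment's flag (toy proxy for read cost)."""
    flags = segment_flags(column, num_segments)
    seg_len = len(column) // num_segments
    return sum(1 for i, bit in enumerate(column) if bit != flags[i // seg_len])


# A 16-cell column whose upper and lower halves hold opposite data.
column = [0] * 8 + [1] * 8
print(costly_reads(column, 1))  # one flag for the whole column
print(costly_reads(column, 2))  # one flag per half-column segment
```

In this example a single flag for the whole column leaves half of the bits mismatched, while two segments with independent flags match every bit. That is the kind of finer-grained pattern the abstract says is hidden when the column is treated as one unit; the actual savings depend on the circuit-level overheads analyzed in the paper.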
