Sparsity-Aware Clamping Readout Scheme for High Parallelism and Low Power Nonvolatile Computing-in-Memory Based on Resistive Memory

The input parallelism of resistive memory (RRAM) based nonvolatile computing-in-memory (nvCIM) structures is limited by both the signal margin and the readout precision. In this work, we propose a sparsity-aware clamping (SAC) scheme, together with its circuit implementation, for nvCIM through circuit-algorithm co-design. The scheme adaptively tunes the quantization range and resolution of the readout circuit according to the degree of sparsity in the neural network model. As a result, it increases the input parallelism of nvCIMs without degrading the signal margin or increasing the hardware cost of the analogue readout. A case study on a multi-layer perceptron (MLP) model processed with the proposed nvCIM structure shows that the SAC scheme improves throughput by 2× and energy efficiency by 25.35% with negligible loss of inference accuracy.
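
The idea of matching the readout range to sparsity can be illustrated with a small behavioural sketch. This is not the circuit or algorithm of this work. It is a toy model under stated assumptions: inputs and weights are binary, a bit-line multiply-accumulate (MAC) sum is the count of rows where both are 1, and the clamp level is chosen as a percentile of observed sums. The function names (mac_sums, choose_clamp, readout) and all parameter values are hypothetical.

```python
import random

def mac_sums(rows, input_sparsity, weight_density, trials, seed=0):
    """Sample bit-line MAC sums for `rows` rows activated in parallel.

    Assumption: binary inputs (1 with probability 1 - input_sparsity)
    and binary weights (1 with probability weight_density).
    """
    rng = random.Random(seed)
    sums = []
    for _ in range(trials):
        s = 0
        for _ in range(rows):
            x = rng.random() >= input_sparsity
            w = rng.random() < weight_density
            s += int(x and w)
        sums.append(s)
    return sums

def choose_clamp(sums, coverage=0.99):
    """Pick a clamp level that covers roughly `coverage` of observed sums.

    Assumption: a simple percentile rule stands in for whatever
    sparsity-dependent tuning the real scheme applies.
    """
    ordered = sorted(sums)
    idx = min(len(ordered) - 1, int(coverage * len(ordered)))
    return max(1, ordered[idx])

def readout(mac_sum, clamp_max, levels):
    """Quantize a MAC sum with `levels` codes spread over [0, clamp_max].

    Sums above clamp_max saturate at clamp_max.
    """
    step = clamp_max / (levels - 1)
    code = round(min(mac_sum, clamp_max) / step)
    return code * step

if __name__ == "__main__":
    # Hypothetical setting: 64 parallel rows, 90% input sparsity, 3-bit readout.
    sums = mac_sums(rows=64, input_sparsity=0.9, weight_density=0.5, trials=2000)
    clamp = choose_clamp(sums)
    full_err = sum(abs(s - readout(s, 64, 8)) for s in sums) / len(sums)
    sac_err = sum(abs(s - readout(s, clamp, 8)) for s in sums) / len(sums)
    print(f"clamp level: {clamp}")
    print(f"mean error, full range: {full_err:.3f}")
    print(f"mean error, clamped:    {sac_err:.3f}")
```

In this toy model, an 8-level readout spread over the full 0 to 64 range has a step of about 9, so a small sum such as 3 is read as 0. Clamping the range to 7 gives a step of 1 and resolves the same sum exactly, at the cost of saturating the rare large sums. This is the trade-off that sparsity makes favourable when more rows are activated in parallel.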