A Fully Integrated Reprogrammable CMOS-RRAM Compute-in-Memory Coprocessor for Neuromorphic Applications

Analog compute-in-memory with resistive random access memory (RRAM) devices promises to overcome the data movement bottleneck in data-intensive artificial intelligence (AI) and machine learning. RRAM crossbar arrays improve the efficiency of vector-matrix multiplications (VMMs), which is a vital operation in these applications. The prototype IC is the first complete, fully integrated analog-RRAM CMOS coprocessor. This article focuses on the digital and analog circuitry that supports efficient and flexible RRAM-based computation. A passive $54\times108$ RRAM crossbar array performs VMM in the analog domain. Specialized mixed-signal circuits stimulate and read the outputs of the RRAM crossbar. The single-chip CMOS prototype includes a reduced instruction set computer (RISC) processor interfaced to a memory-mapped mixed-signal core. In the mixed-signal core, ADCs and DACs interface with the passive RRAM crossbar. The RISC processor controls the mixed-signal circuits and the algorithm data path. The system is fully programmable and supports forward and backward propagation. As proof of concept, a fully integrated 0.18- $\mu \text{m}$ CMOS prototype with a postprocessed RRAM array demonstrates several key functions of machine learning, including online learning. The mixed-signal core consumes 64 mW at an operating frequency of 148 MHz. The total system power consumption considering the mixed-signal circuitry, the digital processor, and the passive RRAM array is 307 mW. The maximum theoretical throughput is 2.6 GOPS at an efficiency of 8.5 GOPS/W.

[1]  Naveen Verma,et al.  A machine-learning classifier implemented in a standard 6T SRAM array , 2016, 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits).

[2]  Zhengya Zhang,et al.  A fully integrated reprogrammable memristor–CMOS system for efficient multiply–accumulate operations , 2019, Nature Electronics.

[3]  Sujan Kumar Gonugondla,et al.  A Multi-Functional In-Memory Inference Processor Using a Standard 6T SRAM Array , 2018, IEEE Journal of Solid-State Circuits.

[4]  Wei Lu,et al.  Short-term Memory to Long-term Memory Transition in a Nanoscale Memristor , 2022 .

[5]  Michael P. Flynn,et al.  A 1 mW 71.5 dB SNDR 50 MS/s 13 bit Fully Differential Ring Amplifier Based SAR-Assisted Pipeline ADC , 2015, IEEE Journal of Solid-State Circuits.

[6]  Jiaming Zhang,et al.  Analogue signal and image processing with large memristor crossbars , 2017, Nature Electronics.

[7]  Meng-Fan Chang,et al.  A 65nm 1Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).

[8]  Daehwa Paik,et al.  A low-noise self-calibrating dynamic comparator for high-speed ADCs , 2008, 2008 IEEE Asian Solid-State Circuits Conference.

[9]  Khaled N. Salama,et al.  Memristor-based memory: The sneak paths problem and solutions , 2013, Microelectron. J..

[10]  Meng-Fan Chang,et al.  24.1 A 1Mb Multibit ReRAM Computing-In-Memory Macro with 14.6ns Parallel MAC Computing Time for CNN Based AI Edge Processors , 2019, 2019 IEEE International Solid- State Circuits Conference - (ISSCC).

[11]  Siddharth Gaba,et al.  Synaptic behaviors and modeling of a metal oxide memristive device , 2011 .

[12]  Emmanuelle J. Merced-Grafals,et al.  Repeatable, accurate, and high speed multi-level programming of memristor 1T1R arrays for power efficient analog computing applications , 2016, Nanotechnology.

[13]  Michael P. Flynn,et al.  A calibration-free 2.3 mW 73.2 dB SNDR 15b 100 MS/s four-stage fully differential ring amplifier based SAR-assisted pipeline ADC , 2017, 2017 Symposium on VLSI Circuits.

[14]  Jiantao Zhou,et al.  Crossbar RRAM Arrays: Selector Device Requirements During Write Operation , 2014, IEEE Transactions on Electron Devices.

[15]  Wei D. Lu,et al.  Sparse coding with memristor networks. , 2017, Nature nanotechnology.