An 8-bit Ring-Amplifier Based Mixed-Signal MAC Circuit With Full Digital Interface and Variable Accumulation Length

An 8-bit switched-capacitor multiply-and-accumulate (MAC) circuit in 65nm CMOS is presented. Built from cascaded low-power ring-amplifier-based switched-capacitor DACs, the MAC circuit features a programmable accumulation length. The prototype achieves a precision-scaled energy efficiency of 1.32fJ per MAC operation, which is comparable to other state-of-the-art MAC circuits, along with best-in-class linearity. The noise performance has been verified using four real-world convolutional neural networks (CNNs) on 10,000-image data sets with up to 1,000 classes, showing an accuracy drop of less than 2% compared to the baseline 32-bit floating-point MAC.
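To make the comparison against the floating-point baseline concrete, the following is a minimal behavioral sketch of an 8-bit MAC with a programmable accumulation length. It models only the digital input/output behavior, not the ring-amplifier or switched-capacitor DAC circuitry, and it adds no noise. The function names (`quantize`, `mac_int8`, `mac_float`), the symmetric scaling scheme, and the `acc_len` parameter are illustrative assumptions, not taken from the paper.

```python
def quantize(x, scale, bits=8):
    """Map a real value to a signed integer code, clipped to the bits-wide range.

    Assumption: simple symmetric scaling; the paper's actual number format
    is not specified in the abstract.
    """
    qmax = 2 ** (bits - 1) - 1
    q = round(x / scale)
    return max(-qmax - 1, min(qmax, q))


def mac_int8(weights, activations, w_scale, a_scale, acc_len):
    """Accumulate the first acc_len products of 8-bit codes, then rescale.

    acc_len stands in for the programmable accumulation length (hypothetical name).
    """
    acc = 0
    for w, a in zip(weights[:acc_len], activations[:acc_len]):
        acc += quantize(w, w_scale) * quantize(a, a_scale)
    return acc * w_scale * a_scale


def mac_float(weights, activations, acc_len):
    """Floating-point reference MAC over the same accumulation length."""
    return sum(w * a for w, a in zip(weights[:acc_len], activations[:acc_len]))


if __name__ == "__main__":
    w = [0.5, -1.0, 1.5]
    x = [1.0, 2.0, -0.5]
    for n in (2, 3):
        print(n, mac_int8(w, x, 0.5, 0.5, n), mac_float(w, x, n))
```

With inputs that fall exactly on the quantization grid, as in the usage lines above, the two results agree. For general inputs the gap between `mac_int8` and `mac_float` is the quantization error, which a hardware evaluation would combine with circuit noise.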
