A Charge-Digital Hybrid Compute-In-Memory Macro with Full-Precision 8-bit Multiply-Accumulation for Edge Computing Devices

Compute-in-memory (CIM) is emerging as a new computing architecture to overcome the high energy consumption of edge-side AI and IoT devices. For high-precision neural network computation, analog CIM and digital CIM each have their own advantages and disadvantages. In this paper, we combine the high energy efficiency of analog CIM with the high accuracy of digital CIM and propose a charge-digital hybrid CIM (CDH-CIM) macro. By computing the high bits in the digital domain and the low bits in the charge domain, the multiply-accumulation (MAC) operation of 8b input activations (IAs) and 8b weights is achieved with no precision loss. The proposed CDH-CIM macro is designed in a 22nm FDSOI CMOS process. Simulation shows that the macro achieves 6.98~11.0 TOPS/W at 0.8V and 71.92% inference accuracy on the CIFAR-100 dataset.
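To make the bit-splitting arithmetic concrete, below is a minimal Python sketch of how an 8b x 8b MAC can be decomposed into a high-bit partial sum (here standing in for the digital domain) and a low-bit partial sum (standing in for the charge domain) and then recombined by shift-and-add with no precision loss. The 4/4 split point, the unsigned operand formats, and all function names are illustrative assumptions for this sketch and are not taken from the paper.

```python
# Minimal numerical sketch of a bit-split MAC, under the assumptions stated above:
# weights are split into a 4-bit high part and a 4-bit low part, and both
# operands are treated as unsigned 8-bit values.
import numpy as np

rng = np.random.default_rng(0)

def full_precision_mac(ia, w):
    """Reference 8b x 8b multiply-accumulate."""
    return int(np.dot(ia.astype(np.int64), w.astype(np.int64)))

def hybrid_mac(ia, w, split=4):
    """Compute two partial MACs on the high and low weight bits, then
    recombine them; the shift-and-add restores the full-precision result."""
    w_high = w >> split               # high bits (digital-domain partial sum)
    w_low = w & ((1 << split) - 1)    # low bits (charge-domain partial sum)
    mac_high = int(np.dot(ia.astype(np.int64), w_high.astype(np.int64)))
    mac_low = int(np.dot(ia.astype(np.int64), w_low.astype(np.int64)))
    return (mac_high << split) + mac_low

ia = rng.integers(0, 256, size=64, dtype=np.uint8)  # 8b input activations
w = rng.integers(0, 256, size=64, dtype=np.uint8)   # 8b weights
assert hybrid_mac(ia, w) == full_precision_mac(ia, w)
print("bit-split MAC matches full precision:", hybrid_mac(ia, w))
```

Because the low-bit partial sum carries only 4 of the 8 weight bits, any analog error it introduces is bounded to the low-order part of the result, which is the intuition behind keeping the high bits in the digital domain.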
