ADC/DAC-Free Analog Acceleration of Deep Neural Networks with Frequency Transformation

Edge processing of deep neural networks (DNNs) is becoming increasingly important because it extracts valuable information directly at the data source, minimizing latency and energy consumption. Frequency-domain model compression, such as with the Walsh-Hadamard transform (WHT), has been identified as an efficient alternative to conventional spatial-domain processing. However, the benefits of frequency-domain processing are often offset by the additional multiply-accumulate (MAC) operations it requires. This paper proposes a novel approach to energy-efficient acceleration of frequency-domain neural networks that performs frequency-based tensor transformations in the analog domain. Our approach offers unique opportunities to enhance computational efficiency, yielding several high-level advantages: an array micro-architecture with inherent parallelism, ADC/DAC-free analog computation, and increased output sparsity. Because the transformation matrix contains no trainable parameters, the array cells can be made more compact. Moreover, our novel array micro-architecture enables adaptive stitching of cells column-wise and row-wise, thereby enabling perfect parallelism in computations. Additionally, our scheme achieves ADC/DAC-free computation by training against highly quantized matrix-vector products, leveraging the parameter-free nature of the matrix multiplications. Another crucial aspect of our design is its ability to handle signed-bit processing for frequency-based transformations, which increases output sparsity and reduces the digitization workload. On a 16$\times$16 crossbar with 8-bit input processing, the proposed approach achieves an energy efficiency of 1602 tera operations per second per watt (TOPS/W) without the early-termination strategy and 5311 TOPS/W with early termination at VDD = 0.8 V.
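
To make the frequency-domain idea concrete, the following is a minimal Python/NumPy sketch (not the paper's analog-hardware implementation) of the two operations the abstract relies on: a parameter-free fast Walsh-Hadamard transform applied to an activation vector, and a signed-bit ({-1, +1}) matrix-vector product whose small outputs are zeroed to mimic the increased output sparsity. The function names, the 16-element vector size, and the threshold value are illustrative assumptions.

```python
# Minimal sketch of the parameter-free WHT and signed-bit MAC ideas described above.
# Names and parameter values are illustrative; the paper realizes these operations
# in analog crossbar hardware rather than in software.

import numpy as np

def fast_wht(x: np.ndarray) -> np.ndarray:
    """Fast Walsh-Hadamard transform; len(x) must be a power of two.
    The transform uses only additions/subtractions and has no trainable parameters."""
    y = x.astype(np.float64).copy()
    n = y.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    return y

def signed_bit_mac(w_signs: np.ndarray, x: np.ndarray, threshold: float) -> np.ndarray:
    """Matrix-vector product with {-1, +1} weights; outputs below the threshold
    are zeroed, emulating the increased output sparsity noted in the abstract."""
    out = w_signs @ x
    out[np.abs(out) < threshold] = 0.0
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)                  # 16-element activation, matching a 16x16 crossbar
    x_freq = fast_wht(x)                         # parameter-free frequency-domain features
    w = np.sign(rng.standard_normal((16, 16)))   # signed-bit (binary) weights
    y = signed_bit_mac(w, x_freq, threshold=2.0)
    print("nonzero outputs:", np.count_nonzero(y), "of", y.size)
```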
