Acceleration of Convolutional Neural Network Using FFT-Based Split Convolutions

Convolutional neural networks (CNNs) have a large number of parameters and hence suffer from high computational complexity in their implementation. Different methods and techniques have been developed to alleviate the complexity of CNNs, such as quantization and pruning. Among these simplification methods, computation in the Fourier domain is regarded as a new paradigm for the acceleration of CNNs. Recent studies on Fast Fourier Transform (FFT) based CNNs aim at simplifying the computations required for the FFT. However, considerable room remains for reducing the computational complexity of the FFT. In this paper, a new method for CNN processing in the FFT domain is proposed, which is based on input splitting. Computing the FFT with small kernels, as in CNNs, is problematic, and splitting is an effective solution to the issues arising from small kernels. Splitting reduces the redundancy of methods such as overlap-and-add and increases efficiency. A hardware implementation of the proposed FFT method, as well as several complexity analyses, are presented to demonstrate the performance of the proposed method.
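To make the splitting idea concrete, the following is a minimal NumPy sketch of FFT-domain convolution with input splitting and overlap-and-add, for a 1-D signal and a small kernel. It is an illustration of the general technique the abstract builds on, not the paper's proposed architecture; the function name and the `block_len` parameter are assumptions for this example.

```python
import numpy as np

def fft_conv_overlap_add(x, h, block_len=64):
    """Linear convolution of a long input x with a small kernel h,
    computed block-by-block in the FFT domain (overlap-and-add).
    Illustrative sketch only; block_len is a tunable assumption."""
    m = len(h)
    fft_len = block_len + m - 1            # length of each block's linear convolution
    H = np.fft.rfft(h, fft_len)            # kernel spectrum, computed once
    y = np.zeros(len(x) + m - 1)
    for start in range(0, len(x), block_len):
        block = x[start:start + block_len]
        Y = np.fft.rfft(block, fft_len) * H        # pointwise product = convolution
        seg = np.fft.irfft(Y, fft_len)
        end = min(start + fft_len, len(y))
        y[start:end] += seg[:end - start]          # overlap-and-add of block results
    return y

x = np.random.randn(1000)
h = np.random.randn(5)                     # small CNN-style kernel
assert np.allclose(fft_conv_overlap_add(x, h), np.convolve(x, h))
```

Splitting keeps each FFT short (on the order of the block length rather than the full input), which is where the efficiency gain over a single large FFT comes from when the kernel is small.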