Exploring Hardware-Friendly Bottleneck Architecture in CNN for Embedded Computing Systems

In this paper, we explore how to design lightweight CNN architectures for embedded computing systems. We propose the L-MobileNet model for ZYNQ-based hardware platforms. L-MobileNet adapts well to hardware acceleration; its network structure is inspired by the state-of-the-art Inception-ResNet [8] and MobileNet-V2 [19], and it effectively reduces parameters and latency while maintaining inference accuracy. We deploy L-MobileNet on both a GPU and a ZYNQ embedded platform to fully evaluate the performance of our design. Measured on the CIFAR-10 and CIFAR-100 datasets [6], L-MobileNet achieves a 3× speedup with 3.7× fewer parameters than MobileNet-V2 at similar accuracy, and a 2× speedup with 1.5× fewer parameters than ShuffleNet-V2 [2] at the same accuracy. Experiments show that our model obtains better performance because of the special consideration given to hardware acceleration and to software-hardware co-design in the L-MobileNet bottleneck architecture.
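Since the abstract describes the L-MobileNet block only at a high level, the sketch below illustrates the MobileNet-V2-style inverted-residual bottleneck [19] that it builds on, written with TensorFlow/Keras [24]. This is a minimal sketch of the baseline block, not the paper's actual implementation: the function name inverted_residual and the expansion and stride parameters are illustrative assumptions.

import tensorflow as tf
from tensorflow.keras import layers

def inverted_residual(x, out_channels, stride=1, expansion=6):
    """MobileNet-V2-style inverted-residual bottleneck:
    1x1 expand -> 3x3 depthwise -> 1x1 linear projection."""
    in_channels = x.shape[-1]
    # Pointwise expansion: widen the representation before the cheap depthwise conv.
    h = layers.Conv2D(expansion * in_channels, 1, use_bias=False)(x)
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(max_value=6.0)(h)
    # Depthwise 3x3 convolution: spatial filtering with one filter per channel.
    h = layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(h)
    h = layers.BatchNormalization()(h)
    h = layers.ReLU(max_value=6.0)(h)
    # Linear bottleneck: project back down with no nonlinearity.
    h = layers.Conv2D(out_channels, 1, use_bias=False)(h)
    h = layers.BatchNormalization()(h)
    # Residual shortcut only when input and output shapes match.
    if stride == 1 and in_channels == out_channels:
        h = layers.Add()([x, h])
    return h

# Example: one block operating on CIFAR-sized feature maps.
inputs = tf.keras.Input(shape=(32, 32, 16))
outputs = inverted_residual(inputs, out_channels=16, stride=1)
model = tf.keras.Model(inputs, outputs)
model.summary()

The 1×1 linear projection and the per-channel depthwise convolution are what keep the parameter count and memory traffic low; this regular, channel-separable structure is also what makes such bottlenecks amenable to FPGA-style accelerators such as the ZYNQ platform targeted here.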

[1] François Chollet et al. Xception: Deep Learning with Depthwise Separable Convolutions. CVPR, 2017.

[2] Xiangyu Zhang et al. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. ECCV, 2018.

[3] Sergey Ioffe et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ICML, 2015.

[4] Li Fei-Fei et al. Progressive Neural Architecture Search. ECCV, 2018.

[5] Kurt Keutzer et al. SqueezeNext: Hardware-Aware Neural Network Design. CVPR Workshops (CVPRW), 2018.

[6] Alex Krizhevsky et al. Learning Multiple Layers of Features from Tiny Images. Technical report, 2009.

[7] Kilian Q. Weinberger et al. Densely Connected Convolutional Networks. CVPR, 2017.

[8] Sergey Ioffe et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. AAAI, 2017.

[9] Zhuowen Tu et al. Aggregated Residual Transformations for Deep Neural Networks. CVPR, 2017.

[10] Lawrence D. Jackel et al. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation, 1989.

[11] Bo Chen et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv preprint, 2017.

[12] Alok Aggarwal et al. Regularized Evolution for Image Classifier Architecture Search. AAAI, 2019.

[13] Yoshua Bengio et al. Understanding the difficulty of training deep feedforward neural networks. AISTATS, 2010.

[14] Forrest N. Iandola et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size. arXiv preprint, 2016.

[15] Vijay Vasudevan et al. Learning Transferable Architectures for Scalable Image Recognition. CVPR, 2018.

[16] Xiangyu Zhang et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. CVPR, 2018.

[17] Jingdong Wang et al. Interleaved Group Convolutions for Deep Neural Networks. arXiv preprint, 2017.

[18] Andrew Zisserman et al. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR, 2015.

[19] Mark Sandler et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks. CVPR, 2018.

[20] Gang Sun et al. Squeeze-and-Excitation Networks. CVPR, 2018.

[21] Kilian Q. Weinberger et al. CondenseNet: An Efficient DenseNet Using Learned Group Convolutions. CVPR, 2018.

[22] Dumitru Erhan et al. Going deeper with convolutions. CVPR, 2015.

[23] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2). 1983.

[24] Martín Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv preprint, 2016.

[25] Song Han et al. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR, 2016.

[26] Jian Sun et al. Identity Mappings in Deep Residual Networks. ECCV, 2016.

[27] Jian Sun et al. Deep Residual Learning for Image Recognition. CVPR, 2016.

[28] H. Robbins. A Stochastic Approximation Method. The Annals of Mathematical Statistics, 1951.

[29] Geoffrey E. Hinton et al. ImageNet classification with deep convolutional neural networks. Communications of the ACM, 2017.