Image recognition based on multi-scale dilated lightweight network model

Lightweight models aim to maintain recognition performance while reducing parameter count, so that complex laboratory models can run on mobile and embedded devices. We present a multi-scale dilated lightweight network model for image recognition. ShuffleNet is a classical lightweight neural network that introduced channel shuffle to exchange information between groups during group convolution. However, ShuffleNet does not make full use of each group's information after the shuffle. Since channel shuffle guarantees that each group contains information from the other groups, we propose to process each group with a different dilated convolution, obtaining multi-scale information from different receptive fields without increasing the parameter count. We also modify the network to reduce the gridding artifacts caused by dilated convolution. Experiments on CIFAR-10 and EMNIST show that the improved algorithm outperforms the baseline method.
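The two mechanisms the abstract combines, channel shuffle and per-group dilation, can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names, the 3×3 kernel, and the dilation rates (1, 2, 3) are assumptions chosen only to show that same-sized kernels with different dilations yield different receptive fields at no extra parameter cost.

```python
import numpy as np

def channel_shuffle(x, groups):
    """ShuffleNet-style channel shuffle: after a grouped convolution,
    interleave channels so every group sees all other groups."""
    n, c, h, w = x.shape
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)  # swap group axis and per-group channel axis
    return x.reshape(n, c, h, w)

def effective_kernel(k, d):
    """Effective receptive field of a k x k convolution with dilation d."""
    return k + (k - 1) * (d - 1)

# Label each channel by its index to trace the shuffle.
x = np.arange(6, dtype=float).reshape(1, 6, 1, 1) * np.ones((1, 6, 2, 2))
y = channel_shuffle(x, groups=3)
print(y[0, :, 0, 0])  # channel order 0..5 becomes 0, 2, 4, 1, 3, 5

# Illustrative per-group dilation rates: identical 3x3 kernels
# (hence no added parameters) but growing receptive fields.
for d in (1, 2, 3):
    print(f"dilation {d}: receptive field {effective_kernel(3, d)}")
```

After the shuffle, each new group of channels mixes the original groups, so applying a different dilation rate per group spreads multi-scale context across all of them.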
