A-MnasNet: Augmented MnasNet for Computer Vision

Convolutional Neural Networks (CNNs) play an essential role in Deep Learning and are used extensively in Computer Vision. Although computationally complex, they are highly effective at extracting features from images and video streams. After AlexNet [5] won the ILSVRC [8] in 2012, research on CNNs increased dramatically, and many state-of-the-art architectures were introduced, including VGG Net [12], GoogLeNet [13], ResNet [18], Inception-v4 [14], Inception-ResNet-v2 [14], ShuffleNet [23], Xception [24], MobileNet [6], MobileNetV2 [7], SqueezeNet [16], and SqueezeNext [17]. The prevailing trend was to add layers to CNNs to improve their performance, but model size grew along with depth. This problem was addressed by new algorithms that reduce model size, and as a result we now have CNN models that run on mobile devices; these mobile models are small and fast, which in turn reduces the computational cost of the embedded system. This paper follows a similar idea: it proposes Augmented MnasNet (A-MnasNet), a new model derived from MnasNet [1]. Trained on the CIFAR-10 [4] dataset, A-MnasNet achieves a validation accuracy of 96.89% with a model size of 11.6 MB, outperforming its baseline architecture MnasNet, which achieves a validation accuracy of 80.8% with a model size of 12.7 MB on the same dataset.
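To make the baseline setup concrete, the following is a minimal PyTorch sketch that trains torchvision's stock MnasNet (mnasnet1_0) on CIFAR-10 and reports validation accuracy and on-disk model size, the two metrics quoted above. This is a sketch under assumptions, not the authors' implementation: the augmentation recipe, optimizer settings, and epoch count are illustrative defaults, and the A-MnasNet modifications themselves are not reproduced here.

```python
# Minimal sketch, assuming PyTorch and torchvision are installed.
# Trains torchvision's stock MnasNet baseline on CIFAR-10; this is NOT the
# authors' A-MnasNet, whose architectural changes are not public here.
import os

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

# Standard CIFAR-10 normalization statistics; the augmentation below
# (random crop + horizontal flip) is a common default, not the paper's recipe.
normalize = T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616))
train_tf = T.Compose([T.RandomCrop(32, padding=4), T.RandomHorizontalFlip(),
                      T.ToTensor(), normalize])
test_tf = T.Compose([T.ToTensor(), normalize])

train_set = torchvision.datasets.CIFAR10("data", train=True, download=True, transform=train_tf)
test_set = torchvision.datasets.CIFAR10("data", train=False, download=True, transform=test_tf)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=256)

# Stock MnasNet with its classifier head resized for CIFAR-10's 10 classes.
model = torchvision.models.mnasnet1_0(num_classes=10).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # optimizer choice is an assumption
criterion = nn.CrossEntropyLoss()

for epoch in range(30):  # epoch count chosen arbitrarily for the sketch
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        criterion(model(images), labels).backward()
        optimizer.step()

    # Validation accuracy, the metric quoted in the abstract.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            correct += (model(images).argmax(1) == labels).sum().item()
            total += labels.size(0)
    print(f"epoch {epoch + 1}: validation accuracy {100 * correct / total:.2f}%")

# On-disk size of the saved weights, comparable to the MB figures above.
torch.save(model.state_dict(), "mnasnet_cifar10.pt")
print(f"model size: {os.path.getsize('mnasnet_cifar10.pt') / 1e6:.1f} MB")
```

Note that the size figure here is the serialized float32 state_dict; how the paper measures its 11.6 MB and 12.7 MB figures (e.g., with or without quantization) is not specified in the abstract.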

[1] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[2] Forrest N. Iandola, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size, 2016, arXiv.

[3] Vijay Vasudevan, et al. Learning Transferable Architectures for Scalable Image Recognition, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4] Duo Li, et al. HBONet: Harmonious Bottleneck on Two Orthogonal Dimensions, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[5] François Chollet, et al. Xception: Deep Learning with Depthwise Separable Convolutions, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Sebastian Ruder, et al. An overview of gradient descent optimization algorithms, 2016, arXiv.

[7] Bo Chen, et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, arXiv.

[8] Abien Fred Agarap. Deep Learning using Rectified Linear Units (ReLU), 2018, arXiv.

[9] Quoc V. Le, et al. AutoAugment: Learning Augmentation Strategies From Data, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Md. Zakir Hossain, et al. A Comprehensive Survey of Deep Learning for Image Captioning, 2018, ACM Comput. Surv.

[12] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.

[13] Sergey Ioffe, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016, AAAI.

[14] Song Han, et al. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, 2015, ICLR.

[15] Bo Chen, et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16] Yixin Chen, et al. Compressing Neural Networks with the Hashing Trick, 2015, ICML.

[17] Chen Wang. Convolutional Neural Network for Image Classification, 2015.

[18] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[19] Xindong Wu, et al. Object Detection With Deep Learning: A Review, 2018, IEEE Transactions on Neural Networks and Learning Systems.

[20] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[21] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.

[22] Kurt Keutzer, et al. SqueezeNext: Hardware-Aware Neural Network Design, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[23] Yi Yang, et al. Random Erasing Data Augmentation, 2017, AAAI.

[24] Xiangyu Zhang, et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25] Mark Sandler, et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Jian Cheng, et al. Quantized Convolutional Neural Networks for Mobile Devices, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).