DcaNAS: efficient convolutional network Design for Desktop CPU platforms

The hardware platform is a significant consideration in efficient CNN model design. Most lightweight networks are based on GPUs and mobile devices. However, they are usually not efficient nor fast enough for desktop CPU platforms. In this paper, we aim to explore the design of highly-efficient convolutional architectures for desktop CPU platforms. To achieve our goal, we first derive a series of CNN model design guidelines for CPU-based devices by comparing different computing platforms. Based on these proposed guidelines, we further present a Desktop CPU-Aware network architecture search (DcaNAS) to search for the optimal network structure with lower CPU latency. By combining automatic search and manual design, our DcaNAS achieves better flexibility and efficiency. On the ImageNet benchmark, we employ DcaNAS to produce two CPU-based lightweight CNN models: DcaNAS-L for higher accuracy and DcaNAS-S for faster speed. On a single CPU core, DcaNAS-L achieves 78.8% Top-1 (94.6% Top-5) accuracy at 13.6 FPS (73.5 ms), and our DcaNAS-S achieves extremely low CPU latency (43.1 ms). The results show that our DcaNAS method can obtain new state-of-the-art CPU-based networks.

[1]  Sergey Ioffe,et al.  Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Xuesen Zhang,et al.  EcoNAS: Finding Proxies for Economical Neural Architecture Search , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Bo Zhang,et al.  MoGA: Searching Beyond Mobilenetv3 , 2019, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[4]  Kaiming He,et al.  Designing Network Design Spaces , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Shu Liu,et al.  Path Aggregation Network for Instance Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[6]  Xiaojun Chang,et al.  Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  In-So Kweon,et al.  CBAM: Convolutional Block Attention Module , 2018, ECCV.

[8]  Quoc V. Le,et al.  Searching for MobileNetV3 , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[9]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Mark Sandler,et al.  MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Jiahui Yu,et al.  AutoSlim: Towards One-Shot Architecture Search for Channel Numbers , 2019 .

[12]  Victor Adrian Prisacariu,et al.  DSConv: Efficient Convolution Operator , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[13]  Ramesh Raskar,et al.  Designing Neural Network Architectures using Reinforcement Learning , 2016, ICLR.

[14]  François Chollet,et al.  Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Quoc V. Le,et al.  EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.

[16]  Kalyanmoy Deb,et al.  A fast and elitist multiobjective genetic algorithm: NSGA-II , 2002, IEEE Trans. Evol. Comput..

[17]  Qi Tian,et al.  CARS: Continuous Evolution for Efficient Neural Architecture Search , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Quoc V. Le,et al.  EfficientDet: Scalable and Efficient Object Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[20]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[21]  Bo Zhang,et al.  SCARLET-NAS: Bridging the Gap between Stability and Scalability in Weight-sharing Neural Architecture Search , 2019, 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW).

[22]  Vijay Vasudevan,et al.  Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[23]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Song Han,et al.  ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware , 2018, ICLR.

[25]  Yuandong Tian,et al.  FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Xiangyu Zhang,et al.  Single Path One-Shot Neural Architecture Search with Uniform Sampling , 2019, ECCV.

[27]  Tao Huang,et al.  GreedyNAS: Towards Fast One-Shot NAS With Greedy Supernet , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Yuxing Peng,et al.  ThunderNet: Towards Real-Time Generic Object Detection on Mobile Devices , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[29]  Xiangyu Zhang,et al.  ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design , 2018, ECCV.

[30]  Frank Hutter,et al.  SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.

[31]  Bo Chen,et al.  MnasNet: Platform-Aware Neural Architecture Search for Mobile , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Xiangyu Zhang,et al.  ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Jianyuan Guo,et al.  GhostNet: More Features From Cheap Operations , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Qian Zhang,et al.  Densely Connected Search Space for More Flexible Neural Architecture Search , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Bo Zhang,et al.  FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search , 2019, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[36]  Charles X. Ling,et al.  Pelee: A Real-Time Object Detection System on Mobile Devices , 2018, NeurIPS.

[37]  Vinay P. Namboodiri,et al.  HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Kurt Keutzer,et al.  Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Marios Savvides,et al.  Feature Selective Anchor-Free Module for Single-Shot Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Hong-Yuan Mark Liao,et al.  YOLOv4: Optimal Speed and Accuracy of Object Detection , 2020, ArXiv.

[41]  Wei Wu,et al.  Improving One-Shot NAS by Suppressing the Posterior Fading , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Xiangyu Zhang,et al.  DetNet: A Backbone network for Object Detection , 2018, ArXiv.

[43]  Jian Sun,et al.  DetNAS: Backbone Search for Object Detection , 2019, NeurIPS.

[44]  Jian Sun,et al.  Identity Mappings in Deep Residual Networks , 2016, ECCV.

[45]  Yiming Yang,et al.  DARTS: Differentiable Architecture Search , 2018, ICLR.

[46]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[47]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[48]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.