Locally Free Weight Sharing for Network Width Search

Searching for network width is an effective way to slim deep neural networks under hardware budgets. To this end, a one-shot supernet is usually leveraged as a performance evaluator to rank different widths. Nevertheless, current methods mainly follow a manually fixed weight sharing pattern, which limits their ability to distinguish the performance gaps among different widths. In this paper, to better evaluate each width, we propose a loCAlly FrEe weight sharing strategy (CafeNet). In CafeNet, weights are shared more freely, and each width is jointly indicated by its base channels and free channels, where the free channels can be located freely within a local zone so as to better represent each width. Besides, we propose to further reduce the search space by leveraging our introduced FLOPs-sensitive bins. As a result, CafeNet can be trained stochastically and optimized with a min-min strategy. Extensive experiments on the ImageNet, CIFAR-10, CelebA, and MS COCO datasets verify our superiority over other state-of-the-art baselines. For example, our method further boosts the benchmark NAS network EfficientNet-B0 by 0.41% by searching its width more delicately.
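To make the locally free sharing idea concrete, the sketch below shows one plausible way a sampled width could index the shared supernet weights: a contiguous block of base channels plus free channels drawn from a local zone around the width boundary. This is a minimal illustration under our own assumptions; the function name `sample_channel_indices`, the `zone_size` parameter, and the exact zone layout are hypothetical and not taken from the paper's implementation.

```python
import torch

def sample_channel_indices(width, max_channels, zone_size, generator=None):
    """Pick `width` channel indices from a supernet layer with `max_channels`
    filters: the first (width - zone_size) channels act as fixed "base"
    channels, and the remaining "free" channels are sampled without
    replacement from a local zone around the width boundary.

    Hypothetical sketch of locally free weight sharing; not the paper's code.
    """
    num_free = min(zone_size, width)
    num_base = width - num_free
    base = torch.arange(num_base)
    # Local zone: candidate channels bracketing the nominal width boundary.
    zone = torch.arange(num_base, min(width + zone_size, max_channels))
    perm = torch.randperm(len(zone), generator=generator)[:num_free]
    free = zone[perm]
    return torch.cat([base, free.sort().values])

# Example: evaluate a width of 48 in a layer with 64 filters and a zone of 8.
idx = sample_channel_indices(width=48, max_channels=64, zone_size=8)
# Slice the shared supernet convolution weight along its output-channel axis.
shared_weight = torch.randn(64, 32, 3, 3)  # (out, in, kH, kW)
sub_weight = shared_weight[idx]            # weights for this candidate width
```

Because the free channels vary from sample to sample within the zone, each width is evaluated against several nearby weight subsets rather than one fixed slice, which matches the intuition that a freer sharing pattern better separates the performance of neighboring widths.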
