Angle-based Search Space Shrinking for Neural Architecture Search

In this work, we present a simple and general search space shrinking method, called Angle-Based search space Shrinking (ABS), for Neural Architecture Search (NAS). Our approach progressively simplifies the original search space by dropping unpromising candidates, thereby reducing the difficulty for existing NAS methods to find superior architectures. In particular, we propose an angle-based metric to guide the shrinking process. We provide comprehensive evidence showing that, in a weight-sharing supernet, the proposed metric is more stable and accurate than accuracy-based and magnitude-based metrics at predicting the capability of child models. We also show that the angle-based metric converges quickly during supernet training, enabling us to obtain promising shrunk search spaces efficiently. ABS can be easily applied to most NAS approaches (e.g., SPOS, FairNAS, ProxylessNAS, DARTS, and PDARTS). Comprehensive experiments show that ABS can dramatically enhance existing NAS approaches by providing a promising shrunk search space.
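To make the metric concrete, below is a minimal sketch of one way such an angle-based score can be computed, assuming the angle is measured between a child model's concatenated weight vector at supernet initialization and after supernet training (a larger rotation read as evidence of a more promising child). The function name and the convention that the caller extracts the child's weights from the supernet are illustrative assumptions, not the paper's exact implementation:

    import torch

    def angle_metric(init_weights, trained_weights):
        # Flatten and concatenate the child model's selected weights into a
        # single vector at supernet initialization (v0) and after training (vt).
        v0 = torch.cat([w.flatten() for w in init_weights])
        vt = torch.cat([w.flatten() for w in trained_weights])
        # Angle (radians) between the two vectors; the epsilon guards against
        # division by zero and the clamp keeps acos numerically valid.
        cos = torch.dot(v0, vt) / (v0.norm() * vt.norm() + 1e-12)
        return torch.acos(cos.clamp(-1.0, 1.0))

Ranking sampled child models by such a score and then dropping the candidate operators that appear only in low-ranked children would realize one round of the progressive shrinking described above.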

[1] Yoshua Bengio et al. Understanding the difficulty of training deep feedforward neural networks. AISTATS, 2010.

[2] Chuang Gan et al. Once for All: Train One Network and Specialize it for Efficient Deployment. ICLR, 2020.

[3] Vijay Vasudevan et al. Learning Transferable Architectures for Scalable Image Recognition. CVPR, 2018.

[4] Li Fei-Fei et al. Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation. CVPR, 2019.

[5] Theodore Lim et al. SMASH: One-Shot Model Architecture Search through HyperNetworks. ICLR, 2018.

[6] Xiangyu Zhang et al. Single Path One-Shot Neural Architecture Search with Uniform Sampling. ECCV, 2020.

[7] Qi Tian et al. Progressive Differentiable Architecture Search: Bridging the Depth Gap Between Search and Evaluation. ICCV, 2019.

[8] Quoc V. Le et al. Swish: a Self-Gated Activation Function. arXiv:1710.05941, 2017.

[9] Yiming Yang et al. DARTS: Differentiable Architecture Search. ICLR, 2019.

[10] Sergey Ioffe et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ICML, 2015.

[11] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models. 2013.

[12] Yi Yang et al. One-Shot Neural Architecture Search via Self-Evaluated Template Network. ICCV, 2019.

[13] Stéphane Pateux et al. Efficient Progressive Neural Architecture Search. BMVC, 2018.

[14] Qi Tian et al. Scalable NAS with Factorizable Architectural Parameters. arXiv, 2019.

[15] Geoffrey E. Hinton et al. ImageNet classification with deep convolutional neural networks. Commun. ACM, 2012.

[16] Quoc V. Le et al. Efficient Neural Architecture Search via Parameter Sharing. ICML, 2018.

[17] Wei Wu et al. Improving One-Shot NAS by Suppressing the Posterior Fading. CVPR, 2020.

[18] Jian Sun et al. DetNAS: Backbone Search for Object Detection. NeurIPS, 2019.

[19] Yi Yang et al. NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search. ICLR, 2020.

[20] M. Kendall. A new measure of rank correlation. Biometrika, 1938.

[21] Li Fei-Fei et al. Progressive Neural Architecture Search. ECCV, 2018.

[22] Sanjeev Arora et al. Theoretical Analysis of Auto Rate-Tuning by Batch Normalization. ICLR, 2019.

[23] Quoc V. Le et al. Searching for Activation Functions. arXiv, 2018.

[24] George Adam et al. Understanding Neural Architecture Search Techniques. arXiv, 2019.

[25] Bo Zhang et al. FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search. ICCV, 2021.

[26] Hang Xu et al. Auto-FPN: Automatic Network Architecture Adaptation for Object Detection Beyond Classification. ICCV, 2019.

[27] Lihi Zelnik-Manor et al. ASAP: Architecture Search, Anneal and Prune. AISTATS, 2019.

[28] Chen Zhang et al. Deeper Insights into Weight Sharing in Neural Architecture Search. arXiv, 2020.

[29] Jian Sun et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. ICCV, 2015.

[30] Yoshua Bengio et al. Deep Sparse Rectifier Neural Networks. AISTATS, 2011.

[31] Zhanxing Zhu et al. Spherical Motion Dynamics of Deep Neural Networks with Batch Normalization and Weight Decay. arXiv, 2020.

[32] Lihi Zelnik-Manor et al. XNAS: Neural Architecture Search with Expert Advice. NeurIPS, 2019.

[33] Yuandong Tian et al. FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search. CVPR, 2019.

[34] Quoc V. Le et al. Understanding and Simplifying One-Shot Architecture Search. ICML, 2018.

[35] Simon Carbonnelle et al. Layer rotation: a surprisingly simple indicator of generalization in deep networks? 2019.

[36] Sanjeev Arora et al. An Exponential Learning Rate Schedule for Deep Learning. ICLR, 2020.

[37] Song Han et al. ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware. ICLR, 2019.

[38] Yi Yang et al. Searching for a Robust Neural Architecture in Four GPU Hours. CVPR, 2019.