Angle-based Search Space Shrinking for Neural Architecture Search

In this work, we present a simple and general search space shrinking method, called Angle-Based search space Shrinking (ABS), for Neural Architecture Search (NAS). Our approach progressively simplifies the original search space by dropping unpromising candidates, thereby reducing the difficulty for existing NAS methods of finding superior architectures. In particular, we propose an angle-based metric to guide the shrinking process. We provide comprehensive evidence showing that, in a weight-sharing supernet, the proposed metric is more stable and accurate than accuracy-based and magnitude-based metrics at predicting the capability of child models. We also show that the angle-based metric converges quickly during supernet training, enabling us to obtain promising shrunk search spaces efficiently. ABS can be easily applied to most NAS approaches (e.g., SPOS, FairNAS, ProxylessNAS, DARTS, and PDARTS). Comprehensive experiments show that ABS can dramatically enhance existing NAS approaches by providing promising shrunk search spaces.
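To make the metric concrete, below is a minimal Python/NumPy sketch of how an angle-based score and one shrinking step might look. This is not the paper's implementation: the function names, the keep_ratio parameter, and the convention that a larger angle marks a more promising candidate are illustrative assumptions, with the angle measured between a child model's concatenated weights at supernet initialization and after supernet training.

import numpy as np

def angle_score(init_weights, trained_weights):
    # Angle between the child model's weight vector at supernet
    # initialization (w0) and after supernet training (wt):
    #     angle = arccos(<w0, wt> / (||w0|| * ||wt||))
    w0 = np.concatenate([w.ravel() for w in init_weights])
    wt = np.concatenate([w.ravel() for w in trained_weights])
    cos = w0.dot(wt) / (np.linalg.norm(w0) * np.linalg.norm(wt) + 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def shrink(candidates, scores, keep_ratio=0.5):
    # One shrinking step: rank candidates by angle-based score and
    # drop the lowest-scoring fraction of the search space.
    ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [c for c, _ in ranked[:max(1, int(len(ranked) * keep_ratio))]]

In an ABS-style loop, such a step would presumably be applied after each supernet training stage, so that subsequent stages train only the surviving candidates.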

[1] Li Fei-Fei, et al. Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation, 2019, CVPR.

[2] Chen Zhang, et al. Deeper Insights into Weight Sharing in Neural Architecture Search, 2020, arXiv.

[3] Yiming Yang, et al. DARTS: Differentiable Architecture Search, 2018, ICLR.

[4] Quoc V. Le, et al. Swish: A Self-Gated Activation Function, 2017, arXiv:1710.05941.

[5] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, ICCV.

[6] Qi Tian, et al. Scalable NAS with Factorizable Architectural Parameters, 2019, arXiv.

[7] Yoshua Bengio, et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.

[8] M. Kendall. A New Measure of Rank Correlation, 1938, Biometrika.

[9] Sanjeev Arora, et al. An Exponential Learning Rate Schedule for Deep Learning, 2020, ICLR.

[10] George Adam, et al. Understanding Neural Architecture Search Techniques, 2019, arXiv.

[11] Hang Xu, et al. Auto-FPN: Automatic Network Architecture Adaptation for Object Detection Beyond Classification, 2019, ICCV.

[12] Song Han, et al. ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware, 2018, ICLR.

[13] Wei Wu, et al. Improving One-Shot NAS by Suppressing the Posterior Fading, 2020, CVPR.

[14] Geoffrey E. Hinton, et al. ImageNet Classification with Deep Convolutional Neural Networks, 2012, Commun. ACM.

[15] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.

[16] Quoc V. Le, et al. Searching for Activation Functions, 2018, arXiv.

[17] Sanjeev Arora, et al. Theoretical Analysis of Auto Rate-Tuning by Batch Normalization, 2018, ICLR.

[18] Lihi Zelnik-Manor, et al. XNAS: Neural Architecture Search with Expert Advice, 2019, NeurIPS.

[19] Yuandong Tian, et al. FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search, 2019, CVPR.

[20] Yi Yang, et al. One-Shot Neural Architecture Search via Self-Evaluated Template Network, 2019, ICCV.

[21] Lihi Zelnik-Manor, et al. ASAP: Architecture Search, Anneal and Prune, 2019, AISTATS.

[22] Xiangyu Zhang, et al. Single Path One-Shot Neural Architecture Search with Uniform Sampling, 2019, ECCV.

[23] Yi Yang, et al. NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search, 2020, ICLR.

[24] Li Fei-Fei, et al. Progressive Neural Architecture Search, 2017, ECCV.

[25] Yi Yang, et al. Searching for a Robust Neural Architecture in Four GPU Hours, 2019, CVPR.

[26] Vijay Vasudevan, et al. Learning Transferable Architectures for Scalable Image Recognition, 2018, CVPR.

[27] Zhanxing Zhu, et al. Spherical Motion Dynamics of Deep Neural Networks with Batch Normalization and Weight Decay, 2020, arXiv.

[28] Jian Sun, et al. DetNAS: Backbone Search for Object Detection, 2019, NeurIPS.

[29] Yoshua Bengio, et al. Understanding the Difficulty of Training Deep Feedforward Neural Networks, 2010, AISTATS.

[30] Theodore Lim, et al. SMASH: One-Shot Model Architecture Search through HyperNetworks, 2017, ICLR.

[31] Quoc V. Le, et al. Efficient Neural Architecture Search via Parameter Sharing, 2018, ICML.

[32] Chuang Gan, et al. Once for All: Train One Network and Specialize It for Efficient Deployment, 2019, ICLR.

[33] Stéphane Pateux, et al. Efficient Progressive Neural Architecture Search, 2018, BMVC.

[34] Qi Tian, et al. Progressive Differentiable Architecture Search: Bridging the Depth Gap Between Search and Evaluation, 2019, ICCV.

[35] Bo Zhang, et al. FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search, 2021, ICCV.

[36] Simon Carbonnelle, et al. Layer Rotation: A Surprisingly Simple Indicator of Generalization in Deep Networks?, 2019.

[37] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models, 2013.

[38] Quoc V. Le, et al. Understanding and Simplifying One-Shot Architecture Search, 2018, ICML.