ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradients Accumulation

Single-path differentiable neural architecture search is attractive for its low computational cost and memory-friendly nature. Surprisingly, however, we find that it suffers from severe search instability, an issue that has been largely ignored and that poses a potential weakness for wider application. In this paper, we delve into this performance collapse issue and propose a new algorithm called RObustifying Memory-Efficient NAS (ROME). Specifically, 1) to keep the topology consistent between the search and evaluation stages, we introduce separate parameters that disentangle the topology from the operations of the architecture, so that connections and operations can be sampled independently without interference; 2) to reduce sampling unfairness and variance, we enforce fair sampling for weight updates and apply a gradient accumulation mechanism to the architecture parameters. Extensive experiments demonstrate that the proposed method is both strong and robust, achieving state-of-the-art results on most of a large set of standard benchmarks.
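The two ingredients above can be illustrated concretely. Below is a minimal, hypothetical PyTorch-style sketch (not the authors' implementation; all names, shapes, and the placeholder loss are assumptions) of how separate topology and operation parameters could be sampled independently via hard Gumbel-softmax, and how architecture gradients could be accumulated over several sampled sub-networks before a single update to reduce sampling variance.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: a cell with `num_edges` candidate connections and
# `num_ops` candidate operations per edge (values chosen for illustration).
num_edges, num_ops, accum_steps = 14, 7, 4

# Disentangled architecture parameters: one set for topology (which edges are
# active), one for operations (which operation runs on each active edge).
beta = torch.zeros(num_edges, requires_grad=True)             # topology logits
alpha = torch.zeros(num_edges, num_ops, requires_grad=True)   # operation logits

arch_opt = torch.optim.Adam([alpha, beta], lr=3e-4)

def sample_subnet():
    """Sample connections and operations independently (hard Gumbel-softmax),
    so only a single path per edge is kept in memory during the forward pass."""
    edge_gate = F.gumbel_softmax(torch.stack([beta, -beta], dim=-1),
                                 tau=1.0, hard=True)[..., 0]   # 0/1 gate per edge
    op_choice = F.gumbel_softmax(alpha, tau=1.0, hard=True)    # one-hot op per edge
    return edge_gate, op_choice

def val_loss(edge_gate, op_choice):
    # Placeholder for the supernet forward pass on a validation batch;
    # kept differentiable so gradients reach alpha and beta.
    return ((edge_gate.sum() + op_choice.sum()) - 1.0) ** 2

# Gradient accumulation: average architecture gradients over several sampled
# sub-networks, then apply one optimizer step.
arch_opt.zero_grad()
for _ in range(accum_steps):
    edge_gate, op_choice = sample_subnet()
    (val_loss(edge_gate, op_choice) / accum_steps).backward()
arch_opt.step()
```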
