Paper Information - ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation

Although differentiable architecture search (DARTS) is a prevalent architecture search approach, it is largely hindered by its substantial memory cost, since the entire supernet resides in memory. Single-path DARTS addresses this by choosing only a single-path submodel at each step, which makes it both memory-friendly and computationally cheap. Nonetheless, we discover a critical issue of single-path DARTS that has so far gone largely unnoticed. Like DARTS, it suffers from severe performance collapse because the derived architectures contain too many parameter-free operations such as skip connections. In this paper, we propose a new algorithm called RObustifying Memory-Efficient NAS (ROME) to remedy this. First, we disentangle the topology search from the operation search to make searching and evaluation consistent. We then adopt Gumbel-Top2 reparameterization and gradient accumulation to robustify the unwieldy bi-level optimization. We verify ROME extensively across 15 benchmarks to demonstrate its effectiveness and robustness.
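
The abstract names Gumbel-Top2 reparameterization and gradient accumulation without giving details. The following is only a minimal sketch of the general ideas, not the authors' implementation. It assumes that Gumbel-Top2 means drawing two distinct candidates by adding Gumbel noise to logits and keeping the two largest, and that gradient accumulation means averaging gradients over several sampled submodels. The names `gumbel_top_k`, `accumulate_gradients`, `edge_logits`, and `toy_grad` are hypothetical.

```python
import math
import random


def gumbel_top_k(logits, k=2, rng=random):
    """Sample k distinct indices with the Gumbel-Top-k trick (assumed form)."""
    perturbed = []
    for index, logit in enumerate(logits):
        # Clamp the uniform draw to avoid log(0).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise, index))
    # Keep the k candidates with the largest perturbed logits.
    perturbed.sort(reverse=True)
    return sorted(index for _, index in perturbed[:k])


def accumulate_gradients(grad_fn, num_samples):
    """Average the gradients returned by grad_fn over num_samples calls."""
    total = None
    for _ in range(num_samples):
        grad = grad_fn()
        if total is None:
            total = list(grad)
        else:
            total = [a + b for a, b in zip(total, grad)]
    return [value / num_samples for value in total]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical logits for four candidate input edges of one node.
    edge_logits = [0.1, 0.5, -0.3, 0.2]
    print("sampled edges:", gumbel_top_k(edge_logits, k=2, rng=rng))

    # Toy gradient: one noisy two-dimensional vector per sampled submodel.
    def toy_grad():
        return [rng.gauss(1.0, 0.5), rng.gauss(-1.0, 0.5)]

    print("accumulated gradient:", accumulate_gradients(toy_grad, num_samples=8))
```

In this sketch, averaging over several samples is meant to reduce the variance that comes from evaluating only one sampled submodel per step. How ROME actually schedules sampling and accumulation is not stated in the abstract.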
