ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation

Albeit being a prevalent architecture searching approach, differentiable architecture search (DARTS) is largely hindered by its substantial memory cost since the entire supernet resides in the memory. This is where the single-path DARTS comes in, which only chooses a single-path submodel at each step. While being memory-friendly, it also comes with low computational costs. Nonetheless, we discover a critical issue of single-path DARTS that has not been primarily noticed. Namely, it also suffers from severe performance collapse since too many parameter-free operations like skip connections are derived, just like DARTS does. In this paper, we propose a new algorithm called RObustifying Memory-Efficient NAS (ROME) to give a cure. First, we disentangle the topology search from the operation search to make searching and evaluation consistent. We then adopt Gumbel-Top2 reparameterization and gradient accumulation to robustify the unwieldy bi-level optimization. We verify ROME extensively across 15 benchmarks to demonstrate its effectiveness and robustness.

[1]  Junchi Yan,et al.  EAutoDet: Efficient Architecture Search for Object Detection , 2022, ECCV.

[2]  Weinan Zhang,et al.  DropNAS: Grouped Operation Dropout for Differentiable Architecture Search , 2022, ArXiv.

[3]  Junchi Yan,et al.  ZARTS: On Zero-order Optimization for Neural Architecture Search , 2021, NeurIPS.

[4]  Quanquan Li,et al.  Differentiable Dynamic Wirings for Neural Networks , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[5]  K. H. Low,et al.  NASI: Label- and Data-agnostic Neural Architecture Search at Initialization , 2021, ICLR.

[6]  Cho-Jui Hsieh,et al.  Rethinking Architecture Selection in Differentiable NAS , 2021, ICLR.

[7]  Junchi Yan,et al.  Rethinking Bi-Level Optimization in Neural Architecture Search: A Gibbs Sampling Perspective , 2021, AAAI.

[8]  Fei Wang,et al.  ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding , 2020, NeurIPS.

[9]  Yi Yang,et al.  DOTS: Decoupling Operation and Topology in Differentiable Architecture Search , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Junchi Yan,et al.  DARTS-: Robustly Stepping out of Performance Collapse Without Indicators , 2020, ICLR.

[11]  Yonggang Hu,et al.  MergeNAS: Merge Operations into One for Differentiable Architecture Search , 2020, IJCAI.

[12]  R. Socher,et al.  Theory-Inspired Path-Regularized Differential Network Architecture Search , 2020, NeurIPS.

[13]  Yuandong Tian,et al.  Few-shot Neural Architecture Search , 2020, ICML.

[14]  Cho-Jui Hsieh,et al.  Stabilizing Differentiable Architecture Search via Perturbation-based Regularization , 2020, ICML.

[15]  Julien N. Siems,et al.  NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search , 2020, ICLR.

[16]  Fabio Maria Carlucci,et al.  NAS evaluation is frustratingly hard , 2019, ICLR.

[17]  Ali K. Thabet,et al.  SGAS: Sequential Greedy Architecture Search , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Xiangxiang Chu,et al.  Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search , 2019, ECCV.

[19]  F. Hutter,et al.  Understanding and Robustifying Differentiable Architecture Search , 2019, ICLR.

[20]  Shifeng Zhang,et al.  DARTS+: Improved Differentiable Architecture Search with Early Stopping , 2019, ArXiv.

[21]  Marius Lindauer,et al.  Best Practices for Scientific Research on Neural Architecture Search , 2019, ArXiv.

[22]  Lingxi Xie,et al.  PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search , 2019, ICLR.

[23]  Bo Zhang,et al.  FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search , 2019, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[24]  Yi Yang,et al.  Searching for a Robust Neural Architecture in Four GPU Hours , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Quoc V. Le,et al.  EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.

[26]  Quoc V. Le,et al.  Searching for MobileNetV3 , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27]  Qi Tian,et al.  Progressive Differentiable Architecture Search: Bridging the Depth Gap Between Search and Evaluation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[28]  Quoc V. Le,et al.  NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Jie Liu,et al.  Single-Path NAS: Designing Hardware-Efficient ConvNets in less than 4 Hours , 2019, ECML/PKDD.

[30]  Xiangyu Zhang,et al.  Single Path One-Shot Neural Architecture Search with Uniform Sampling , 2019, ECCV.

[31]  Aaron Klein,et al.  NAS-Bench-101: Towards Reproducible Neural Architecture Search , 2019, ICML.

[32]  Martin Jaggi,et al.  Evaluating the Search Phase of Neural Architecture Search , 2019, ICLR.

[33]  Li Fei-Fei,et al.  Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Yuandong Tian,et al.  FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Song Han,et al.  ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware , 2018, ICLR.

[36]  Liang Lin,et al.  SNAS: Stochastic Neural Architecture Search , 2018, ICLR.

[37]  Bo Chen,et al.  MnasNet: Platform-Aware Neural Architecture Search for Mobile , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Quoc V. Le,et al.  Understanding and Simplifying One-Shot Architecture Search , 2018, ICML.

[39]  Yiming Yang,et al.  DARTS: Differentiable Architecture Search , 2018, ICLR.

[40]  Quoc V. Le,et al.  Efficient Neural Architecture Search via Parameter Sharing , 2018, ICML.

[41]  Alok Aggarwal,et al.  Regularized Evolution for Image Classifier Architecture Search , 2018, AAAI.

[42]  Li Fei-Fei,et al.  Progressive Neural Architecture Search , 2017, ECCV.

[43]  Theodore Lim,et al.  SMASH: One-Shot Model Architecture Search through HyperNetworks , 2017, ICLR.

[44]  Vijay Vasudevan,et al.  Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[45]  Quoc V. Le,et al.  Neural Architecture Search with Reinforcement Learning , 2016, ICLR.

[46]  Ben Poole,et al.  Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.

[47]  Yee Whye Teh,et al.  The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.

[48]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[50]  Junchi Yan,et al.  A Max-Flow Based Approach for Neural Architecture Search , 2022, ECCV.

[51]  Andrew Y. Ng,et al.  Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .

[52]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .