StacNAS: Towards Stable and Consistent Differentiable Neural Architecture Search

Differentiable Neural Architecture Search algorithms such as DARTS have attracted much attention due to their low search cost and competitive accuracy. However, DARTS has been observed to be unstable, especially when applied to new problems. One cause of this instability is the difficulty of the two-level (bilevel) optimization. We identify two further causes: (1) multicollinearity among correlated/similar operations leads to unpredictable changes in the architecture parameters during search; (2) the gap in optimization complexity between the proxy search stage and the final training leads to suboptimal architectures. Based on these findings, we propose a two-stage grouped variable pruning algorithm that uses one-level optimization: the first stage activates the best group of operations, and the second stage selects the best operation within the activated group. Extensive experiments verify the superiority of the proposed method in both accuracy and stability. On the DARTS search space, the proposed strategy obtains state-of-the-art accuracies on CIFAR-10, CIFAR-100, and ImageNet. Code is available at this https URL.
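The abstract describes the selection rule only in words, so the following is a minimal sketch of the two-stage grouped pruning idea for a single edge of the cell. It assumes a hypothetical grouping of correlated operations and made-up architecture-parameter values; the function and variable names (OP_GROUPS, select_operation, stage1/stage2 alphas) are illustrative and not the authors' implementation.

```python
import numpy as np

# Assumed grouping of similar/correlated operations (e.g. separable convs,
# dilated convs, pooling, skip); the actual grouping in the paper may differ.
OP_GROUPS = {
    "sep_conv": ["sep_conv_3x3", "sep_conv_5x5"],
    "dil_conv": ["dil_conv_3x3", "dil_conv_5x5"],
    "pool":     ["max_pool_3x3", "avg_pool_3x3"],
    "skip":     ["skip_connect"],
}

def softmax(x):
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

def select_operation(stage1_alphas, stage2_alphas):
    """Two-stage selection for one edge.

    stage1_alphas: dict group -> architecture parameter learned in stage 1
    stage2_alphas: dict group -> dict op -> parameter learned in stage 2
    """
    # Stage 1: activate the group with the largest softmaxed architecture weight.
    groups = list(stage1_alphas)
    group_probs = softmax([stage1_alphas[g] for g in groups])
    best_group = groups[int(np.argmax(group_probs))]

    # Stage 2: within the activated group, keep the single best operation.
    ops = OP_GROUPS[best_group]
    op_probs = softmax([stage2_alphas[best_group][o] for o in ops])
    return ops[int(np.argmax(op_probs))]

# Toy usage with made-up parameter values.
stage1 = {"sep_conv": 1.2, "dil_conv": 0.3, "pool": -0.5, "skip": 0.1}
stage2 = {
    "sep_conv": {"sep_conv_3x3": 0.9, "sep_conv_5x5": 0.2},
    "dil_conv": {"dil_conv_3x3": 0.0, "dil_conv_5x5": 0.0},
    "pool":     {"max_pool_3x3": 0.0, "avg_pool_3x3": 0.0},
    "skip":     {"skip_connect": 0.0},
}
print(select_operation(stage1, stage2))  # -> "sep_conv_3x3"
```

The point of grouping is that operations within a group are highly correlated, so competing at the group level first avoids the multicollinearity problem the abstract identifies; only after a group wins do its members compete with each other.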
