Stabilizing Differentiable Architecture Search via Perturbation-based Regularization

Differentiable architecture search (DARTS) is a prevailing NAS solution to identify architectures. Based on the continuous relaxation of the architecture space, DARTS learns a differentiable architecture weight and largely reduces the search cost. However, its stability has been challenged for yielding deteriorating architectures as the search proceeds. We find that the precipitous validation loss landscape, which leads to a dramatic performance drop when distilling the final architecture, is an essential factor that causes instability. Based on this observation, we propose a perturbation-based regularization - SmoothDARTS (SDARTS), to smooth the loss landscape and improve the generalizability of DARTS-based methods. In particular, our new formulations stabilize DARTS-based methods by either random smoothing or adversarial attack. The search trajectory on NAS-Bench-1Shot1 demonstrates the effectiveness of our approach and due to the improved stability, we achieve performance gain across various search spaces on 4 datasets. Furthermore, we mathematically show that SDARTS implicitly regularizes the Hessian norm of the validation loss, which accounts for a smoother loss landscape and improved performance.

[1]  Aaron Klein,et al.  NAS-Bench-101: Towards Reproducible Neural Architecture Search , 2019, ICML.

[2]  Graham W. Taylor,et al.  Improved Regularization of Convolutional Neural Networks with Cutout , 2017, ArXiv.

[3]  Wei Pan,et al.  BayesNAS: A Bayesian Approach for Neural Architecture Search , 2019, ICML.

[4]  Zhanxing Zhu,et al.  Efficient Neural Architecture Search via Proximal Iterations , 2020, AAAI.

[5]  Elliot Meyerson,et al.  Evolving Deep Neural Networks , 2017, Artificial Intelligence in the Age of Neural Networks and Brain Computing.

[6]  Lawrence Carin,et al.  Second-Order Adversarial Attack and Certifiable Robustness , 2018, ArXiv.

[7]  Xiangyu Zhang,et al.  ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design , 2018, ECCV.

[8]  Aleksander Madry,et al.  Towards Deep Learning Models Resistant to Adversarial Attacks , 2017, ICLR.

[9]  Jonathon Shlens,et al.  Explaining and Harnessing Adversarial Examples , 2014, ICLR.

[10]  J. Zico Kolter,et al.  Certified Adversarial Robustness via Randomized Smoothing , 2019, ICML.

[11]  Qi Tian,et al.  Progressive Differentiable Architecture Search: Bridging the Depth Gap Between Search and Evaluation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[12]  Shifeng Zhang,et al.  DARTS+: Improved Differentiable Architecture Search with Early Stopping , 2019, ArXiv.

[13]  Haishan Ye,et al.  MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Vijay Vasudevan,et al.  Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Frank Hutter,et al.  NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search , 2020, ICLR.

[17]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[18]  Alok Aggarwal,et al.  Regularized Evolution for Image Classifier Architecture Search , 2018, AAAI.

[19]  Ramesh Raskar,et al.  Accelerating Neural Architecture Search using Performance Prediction , 2017, ICLR.

[20]  Bo Zhang,et al.  Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search , 2020, ECCV.

[21]  Ruslan Salakhutdinov,et al.  Breaking the Softmax Bottleneck: A High-Rank RNN Language Model , 2017, ICLR.

[22]  Ameet Talwalkar,et al.  Random Search and Reproducibility for Neural Architecture Search , 2019, UAI.

[23]  Aaron Klein,et al.  Learning Curve Prediction with Bayesian Neural Networks , 2016, ICLR.

[24]  Song Han,et al.  Path-Level Network Transformation for Efficient Architecture Search , 2018, ICML.

[25]  Martin Jaggi,et al.  Evaluating the Search Phase of Neural Architecture Search , 2019, ICLR.

[26]  Wei Wu,et al.  Practical Block-Wise Neural Network Architecture Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[27]  Yiming Yang,et al.  DARTS: Differentiable Architecture Search , 2018, ICLR.

[28]  Quoc V. Le,et al.  Large-Scale Evolution of Image Classifiers , 2017, ICML.

[29]  Thomas Brox,et al.  Understanding and Robustifying Differentiable Architecture Search , 2020, ICLR.

[30]  Risto Miikkulainen,et al.  Evolving Neural Networks through Augmenting Topologies , 2002, Evolutionary Computation.

[31]  Cho-Jui Hsieh,et al.  Rob-GAN: Generator, Discriminator, and Adversarial Attacker , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Frank Hutter,et al.  Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution , 2018, ICLR.

[33]  Ronald J. Williams,et al.  Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.

[34]  Bo Chen,et al.  MnasNet: Platform-Aware Neural Architecture Search for Mobile , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Liang Lin,et al.  SNAS: Stochastic Neural Architecture Search , 2018, ICLR.

[36]  Xiangning Chen,et al.  Efficient Neural Interaction Function Search for Collaborative Filtering , 2020, WWW.

[37]  Xiangyu Zhang,et al.  ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[38]  Xuanqing Liu,et al.  How Does Noise Help Robustness? Explanation and Exploration under the Neural SDE Framework , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Song Han,et al.  ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware , 2018, ICLR.

[40]  Cho-Jui Hsieh,et al.  Towards Robust Neural Networks via Random Self-ensemble , 2017, ECCV.

[41]  Yi Yang,et al.  Searching for a Robust Neural Architecture in Four GPU Hours , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Xiaopeng Zhang,et al.  PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search , 2020, ICLR.

[43]  Joan Bruna,et al.  Intriguing properties of neural networks , 2013, ICLR.

[44]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Quoc V. Le,et al.  Efficient Neural Architecture Search via Parameter Sharing , 2018, ICML.

[46]  Yong Yu,et al.  Efficient Architecture Search by Network Transformation , 2017, AAAI.

[47]  Theodore Lim,et al.  SMASH: One-Shot Model Architecture Search through HyperNetworks , 2017, ICLR.

[48]  Tie-Yan Liu,et al.  Neural Architecture Optimization , 2018, NeurIPS.

[49]  Quoc V. Le,et al.  Adversarial Examples Improve Image Recognition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Oriol Vinyals,et al.  Hierarchical Representations for Efficient Architecture Search , 2017, ICLR.

[51]  Li Fei-Fei,et al.  Progressive Neural Architecture Search , 2017, ECCV.

[52]  Quoc V. Le,et al.  Understanding and Simplifying One-Shot Architecture Search , 2018, ICML.

[53]  Junmo Kim,et al.  Deep Pyramidal Residual Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Quoc V. Le,et al.  Neural Architecture Search with Reinforcement Learning , 2016, ICLR.

[55]  Shaofeng Cai,et al.  Understanding Architectures Learnt by Cell-based Neural Architecture Search , 2020, ICLR.

[56]  Suman Jana,et al.  Certified Robustness to Adversarial Examples with Differential Privacy , 2018, 2019 IEEE Symposium on Security and Privacy (SP).