Paper Information - SI-VDNAS: Semi-Implicit Variational Dropout for Hierarchical One-shot Neural Architecture Search

Bayesian methods have improved the interpretability and stability of neural architecture search (NAS). In this paper, we propose a novel probabilistic approach, Semi-Implicit Variational Dropout one-shot Neural Architecture Search (SI-VDNAS), which leverages semi-implicit variational dropout to support architecture search with variable operations and edges. SI-VDNAS achieves stable training that is not affected by over-selection of the skip-connect operation. Experimental results demonstrate that SI-VDNAS finds a convergent architecture with only 2.7 MB of parameters within 0.8 GPU-days and achieves a 2.60% top-1 error rate on CIFAR-10. The convergent architecture obtains top-1 error rates of 16.20% and 25.6% when transferred to CIFAR-100 and ImageNet (mobile setting), respectively.

Junni Zou | H. Xiong | Yaoming Wang | Wenrui Dai | Chenglin Li
