Paper Information - Neural Architecture Search as Sparse Supernet

Neural Architecture Search as Sparse Supernet

This paper extends the problem of Neural Architecture Search from Single-Path and Multi-Path Search to automated Mixed-Path Search. In particular, we model the new problem as a sparse supernet, using a new continuous architecture representation with a mixture of sparsity constraints, i.e., Sparse Group Lasso. The sparse supernet is expected to automatically produce sparsely mixed paths over a compact set of nodes. To optimize the proposed sparse supernet, we use a hierarchical accelerated proximal gradient algorithm within a bi-level optimization framework. Extensive experiments on CIFAR-10, CIFAR-100, Tiny ImageNet, and ImageNet demonstrate that the proposed method can find compact, general, and powerful neural architectures.
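To make the Sparse Group Lasso idea concrete, here is a minimal NumPy sketch of the penalty and of a single proximal step. It is an illustration under stated assumptions, not the paper's implementation: the function names, the penalty weights `lam_l1` and `lam_group`, and the reading of each group as the path weights attached to one node are hypothetical. It also omits the hierarchical and accelerated parts of the algorithm named in the abstract, as well as the bi-level optimization loop.

```python
import numpy as np


def sparse_group_lasso_penalty(groups, lam_l1, lam_group):
    """Sum of an element-wise L1 term and a per-group L2 term.

    `groups` is a list of 1-D arrays; here each array is assumed to hold
    the architecture weights of one group (hypothetically, one node).
    """
    l1_term = sum(np.abs(g).sum() for g in groups)
    group_term = sum(np.linalg.norm(g) for g in groups)
    return lam_l1 * l1_term + lam_group * group_term


def prox_sparse_group_lasso(group, step, lam_l1, lam_group):
    """Proximal operator of the penalty above for a single group.

    Step 1: element-wise soft-thresholding (the L1 part) zeroes weak entries.
    Step 2: group soft-thresholding (the L2 part) shrinks the whole group
            and zeroes it entirely if its remaining norm is small.
    """
    shrunk = np.sign(group) * np.maximum(np.abs(group) - step * lam_l1, 0.0)
    norm = np.linalg.norm(shrunk)
    if norm <= step * lam_group:
        return np.zeros_like(shrunk)
    return (1.0 - step * lam_group / norm) * shrunk


# A group with one dominant entry keeps only that entry, shrunk.
strong = prox_sparse_group_lasso(np.array([3.0, -0.5, 0.2]), 1.0, 0.5, 1.0)

# A uniformly weak group is removed as a whole.
weak = prox_sparse_group_lasso(np.array([0.6, -0.4]), 1.0, 0.5, 1.0)
```

The two thresholding steps act at different levels: the L1 part removes individual entries inside a group, while the group part can remove a whole group at once. This matches the abstract's description of sparsely mixed paths over a compact set of nodes, if entries are read as paths and groups as nodes. Some formulations weight each group term by the square root of the group size; that weighting is left out here for brevity.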
