Few-shot Neural Architecture Search

To improve the search efficiency of Neural Architecture Search (NAS), one-shot NAS trains a single super-net to approximate, via weight sharing, the performance of candidate architectures encountered during search. While this greatly reduces the computation cost, the approximation error makes the super-net's performance predictions far less accurate than training each candidate architecture from scratch, which in turn degrades search quality. In this work, we propose few-shot NAS, which explores the use of multiple super-nets: each super-net is pre-trained to be in charge of one sub-region of the search space, reducing its prediction error. Moreover, these super-nets can be trained jointly via sequential fine-tuning. A natural choice of sub-regions is to follow the way the search space itself is split during NAS. We empirically evaluate our approach on three different tasks in NAS-Bench-201. Extensive results demonstrate that few-shot NAS, using only 5 super-nets, significantly improves the performance of many search methods at a slight increase in search time. The architectures found by DARTS and ENAS with few-shot models achieve 88.53% and 86.50% test accuracy on CIFAR-10 in NAS-Bench-201, significantly outperforming their one-shot counterparts (54.30% and 54.30% test accuracy, respectively). Moreover, on AutoGAN and DARTS, few-shot NAS also outperforms previous state-of-the-art models.
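To make the sub-region idea concrete, the sketch below partitions a NAS-Bench-201-style cell space (6 edges, 5 candidate operations per edge) by fixing the operation on a single edge, which yields exactly five child super-nets. This is a minimal illustration under those assumptions: the names SuperNet, split_supernet, and route are hypothetical, not the paper's implementation.

    # Hypothetical sketch of few-shot NAS splitting; names are illustrative,
    # not taken from the paper's code.

    OPS = ["none", "skip_connect", "conv_1x1", "conv_3x3", "avg_pool_3x3"]
    NUM_EDGES = 6  # a NAS-Bench-201 cell has 6 edges, each picking one of 5 ops


    class SuperNet:
        """Stand-in for a weight-sharing super-net restricted to a sub-region."""

        def __init__(self, allowed_ops_per_edge):
            # One set of permitted operations per edge; the super-net covers
            # exactly the architectures whose choices fall inside these sets.
            self.allowed = allowed_ops_per_edge

        def contains(self, arch):
            return all(op in ops for op, ops in zip(arch, self.allowed))

        def fine_tune(self, steps):
            ...  # placeholder: inherit the parent's shared weights, then train briefly


    def split_supernet(parent, edge):
        """Split a super-net into one child per operation choice on `edge`,
        mirroring how a one-shot search space is divided into sub-regions."""
        children = []
        for op in sorted(parent.allowed[edge]):
            allowed = list(parent.allowed)
            allowed[edge] = {op}  # this child covers archs that use `op` on `edge`
            children.append(SuperNet(allowed))
        return children


    def route(arch, supernets):
        """During search, evaluate a candidate with the super-net in charge of it."""
        return next(s for s in supernets if s.contains(arch))


    # One-shot NAS: a single super-net approximates the whole space.
    one_shot = SuperNet([set(OPS) for _ in range(NUM_EDGES)])

    # Few-shot NAS with 5 super-nets: split on one edge, then sequentially
    # fine-tune each child starting from the parent's shared weights.
    few_shot = split_supernet(one_shot, edge=0)
    for child in few_shot:
        child.fine_tune(steps=1000)

    arch = ("conv_3x3", "skip_connect", "conv_1x1", "none", "conv_3x3", "avg_pool_3x3")
    print(route(arch, few_shot).allowed[0])  # -> {'conv_3x3'}

Fixing one edge is only one way to realize "follow the splitting of the search space": splitting on further edges trades more super-nets (and more pre-training cost) for lower per-super-net prediction error.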
