Neural Architecture Search in a Proxy Validation Loss Landscape

This paper searches for the optimal neural architecture by minimizing a proxy of the validation loss. Existing neural architecture search (NAS) methods aim to discover the architecture that best fits the validation examples given the up-to-date network weights. However, back propagation over a large number of validation examples is time-consuming, especially since it must be repeated many times during the search. These intermediate validation results are invaluable, yet they are wasted unless we can use them to predict future performance from past evaluations. In this paper, we propose to approximate the validation loss landscape by learning a mapping from neural architectures to their corresponding validation losses. The optimal neural architecture can then be identified as the minimum of this proxy validation loss landscape. We further develop a novel sampling strategy for an efficient approximation of the loss landscape. Theoretical analysis indicates that the validation loss estimator learned with our sampling strategy achieves a lower error rate and a lower label complexity than one learned with uniform sampling. Experimental results on benchmarks demonstrate that the architecture found by the proposed algorithm achieves satisfactory accuracy at a lower time cost.
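To make the proxy-landscape idea concrete, below is a minimal sketch under stated assumptions: architectures are encoded as fixed-length vectors of discrete operation choices, a random-forest regressor stands in for the paper's validation loss estimator, the sampling is plain random sampling rather than the paper's novel strategy, and `measure_val_loss` is a hypothetical placeholder for actually training and evaluating a candidate network. None of these names or choices come from the paper itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical stand-in: encode an architecture (here, a vector of
# per-edge operation indices) as a fixed-length feature vector.
def encode(arch):
    return np.asarray(arch, dtype=float)

rng = np.random.default_rng(0)
num_ops, num_edges = 5, 8
# Candidate pool of architectures to search over.
candidates = rng.integers(0, num_ops, size=(200, num_edges))

# Placeholder for training a candidate and measuring its validation
# loss; faked here so the sketch runs end to end.
def measure_val_loss(arch):
    return float(np.abs(arch - 2).mean() + 0.05 * rng.standard_normal())

# Stage 1: evaluate a small sample of architectures to collect
# (architecture, validation loss) pairs.
sample_idx = rng.choice(len(candidates), size=30, replace=False)
X = np.stack([encode(candidates[i]) for i in sample_idx])
y = np.array([measure_val_loss(candidates[i]) for i in sample_idx])

# Stage 2: fit the proxy validation-loss landscape on those pairs.
proxy = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Stage 3: identify the minimum of the proxy landscape, with no
# further back propagation over validation examples.
pred = proxy.predict(np.stack([encode(a) for a in candidates]))
best = candidates[int(np.argmin(pred))]
print("estimated best architecture:", best, "proxy loss:", pred.min())
```

The three-stage structure (sample, fit, minimize) is the point of the sketch; the paper's contribution lies in how the samples are chosen, which this random-sampling stand-in does not capture.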
