GNAS: A Generalized Neural Network Architecture Search Framework

In practice, the difficulties encountered in NAS (Neural Architecture Search) are rarely isolated; one usually faces a combination of them (inaccurate performance estimation, the curse of dimensionality, overfitting, high complexity, etc.). From the perspective of solving practical problems, this paper draws on and improves previous work that each addresses a single NAS problem, and combines these improvements into a practical pipeline. We propose a framework that decouples the network structure from the operator search space. Two BOHB (Bayesian Optimization and Hyperband) searchers explore the vast network-structure space and the operator space alternately, and a GCN-based predictor is trained on feedback from the child models. This design alleviates the curse of dimensionality while improving search efficiency. Since activation functions and weight initialization are also important components of a neural network and affect its generalization ability, we introduce an activation-function domain and an initialization-method domain and add them to the operator search space, forming a generalized search space that improves the generalization ability of the child models. Finally, we apply our framework to neural architecture search and achieve significant improvements on multiple datasets.
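
To make the described workflow concrete, here is a minimal, hypothetical sketch of the alternating search over the decoupled structure and operator spaces. All names (RandomSearcher, evaluate_child, the space contents) are illustrative assumptions, not the paper's implementation; a random searcher stands in for BOHB and a random score stands in for child-model training, while the collected (architecture, accuracy) feedback is what a GCN-based predictor would later be trained on.

```python
import random

# Hypothetical generalized operator search space: operators plus
# activation functions and initialization methods, as the abstract describes.
OPERATOR_SPACE = {
    "op":         ["conv3x3", "conv5x5", "sep_conv3x3", "max_pool3x3", "identity"],
    "activation": ["relu", "swish", "leaky_relu", "tanh"],
    "init":       ["he_normal", "xavier_uniform", "orthogonal"],
}

# Hypothetical network-structure search space, kept separate from the operators.
STRUCTURE_SPACE = {
    "num_cells": [4, 6, 8],
    "num_nodes": [3, 4, 5],
    "width":     [16, 32, 64],
}


class RandomSearcher:
    """Stand-in for a BOHB searcher: suggests configs and records rewards."""

    def __init__(self, space):
        self.space = space
        self.history = []

    def suggest(self):
        return {k: random.choice(v) for k, v in self.space.items()}

    def observe(self, config, reward):
        self.history.append((config, reward))


def evaluate_child(structure, operators):
    """Placeholder for training the child model and returning its accuracy."""
    return random.random()  # replace with real training feedback


def search(rounds=10):
    structure_searcher = RandomSearcher(STRUCTURE_SPACE)
    operator_searcher = RandomSearcher(OPERATOR_SPACE)
    predictor_data = []  # (structure, operators, accuracy) triples for a GCN predictor

    for _ in range(rounds):
        # Alternate between the two decoupled spaces, as the abstract describes.
        structure = structure_searcher.suggest()
        operators = operator_searcher.suggest()

        acc = evaluate_child(structure, operators)
        predictor_data.append((structure, operators, acc))

        structure_searcher.observe(structure, acc)
        operator_searcher.observe(operators, acc)

    # In the full framework, this feedback would train a GCN-based accuracy
    # predictor used to filter candidates before full training (not shown here).
    return max(predictor_data, key=lambda x: x[2])


if __name__ == "__main__":
    best_structure, best_operators, best_acc = search()
    print(best_structure, best_operators, best_acc)
```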
