A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions

Deep learning has driven major breakthroughs in many fields, largely because of its powerful automatic representation capabilities. The design of the network architecture has been shown to be crucial to the feature representation of data and to final performance. To obtain good feature representations, researchers have designed various complex network architectures. However, architecture design relies heavily on researchers' prior knowledge and experience. A natural idea is therefore to reduce human intervention as much as possible and let the algorithm design the network architecture automatically, taking a further step toward strong intelligence. In recent years, a large number of algorithms for Neural Architecture Search (NAS) have emerged. They improve on NAS in various ways, and the resulting body of research is rich and complicated. To make it easier for beginners to conduct NAS-related research, a comprehensive and systematic survey of NAS is essential. Previous surveys classify existing work mainly by the basic components of NAS: search space, search strategy, and evaluation strategy. This classification is intuitive, but it makes it difficult for readers to grasp the challenges and the landmark work involved. Therefore, in this survey we provide a new perspective: we begin with an overview of the characteristics of the earliest NAS algorithms, summarize the problems in these early algorithms, and then present the solutions offered by subsequent research. In addition, we conduct a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we outline possible future research directions.
