Prioritized Architecture Sampling with Monte-Carlo Tree Search

One-shot neural architecture search (NAS) methods significantly reduce the search cost by treating the whole search space as a single network, which only needs to be trained once. However, current methods select each operation independently, without considering its dependence on previous layers. Moreover, the historical information obtained at great computational cost is typically used only once and then discarded. In this paper, we introduce a sampling strategy based on Monte Carlo tree search (MCTS), with the search space modeled as a Monte Carlo tree (MCT) that captures the dependency among layers. Furthermore, intermediate results are stored in the MCT for future decisions and a better exploration-exploitation balance. Concretely, the MCT is updated using the training loss as a reward signal for architecture performance; to accurately evaluate the numerous nodes, we propose node communication and hierarchical node selection methods for the training and search stages, respectively, making better use of the operation rewards and hierarchical information. Moreover, for a fair comparison of different NAS methods, we construct an open-source NAS benchmark on a macro search space evaluated on CIFAR-10, namely NAS-Bench-Macro. Extensive experiments on NAS-Bench-Macro and ImageNet demonstrate that our method significantly improves search efficiency and performance. For example, by searching only 20 architectures, our obtained architecture achieves 78.0% top-1 accuracy with 442M FLOPs on ImageNet. Code (Benchmark) is available at: https://github.com/xiusu/NAS-Bench-Macro.
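
To make the sampling procedure concrete, below is a minimal, self-contained sketch of how a layer-wise search space can be modeled as a Monte Carlo tree, with architectures sampled by UCT selection and the tree updated from a reward. This is an illustrative assumption rather than the paper's exact implementation: the operation set, the number of layers, the UCT constant, and the reward (a random stand-in for the negative training loss) are all hypothetical placeholders.

```python
# Sketch: modeling a layer-wise NAS search space as a Monte Carlo tree.
# Assumptions (not from the paper): the operation list, NUM_LAYERS, the UCT
# constant c, and the reward, which here is a random stand-in for the
# negative training loss of the sampled architecture.
import math
import random

OPS = ["mbconv_3x3", "mbconv_5x5", "mbconv_7x7", "identity"]  # hypothetical per-layer choices
NUM_LAYERS = 4


class MCTNode:
    def __init__(self):
        self.children = {}       # operation name -> child MCTNode (one level per layer)
        self.visits = 0
        self.total_reward = 0.0

    def uct_select(self, c=1.4):
        # UCT: balance exploitation (mean reward) against exploration
        # (rarely visited operations get an exploration bonus).
        def score(op):
            child = self.children[op]
            if child.visits == 0:
                return float("inf")
            mean = child.total_reward / child.visits
            explore = c * math.sqrt(math.log(self.visits + 1) / child.visits)
            return mean + explore
        return max(self.children, key=score)


def sample_architecture(root):
    """Walk the tree layer by layer, selecting one operation per layer."""
    node, path, arch = root, [root], []
    for _ in range(NUM_LAYERS):
        for op in OPS:
            node.children.setdefault(op, MCTNode())
        op = node.uct_select()
        arch.append(op)
        node = node.children[op]
        path.append(node)
    return arch, path


def update(path, reward):
    """Backpropagate the reward along the sampled path (visit counts and totals)."""
    for node in path:
        node.visits += 1
        node.total_reward += reward


# Usage: repeatedly sample an architecture, evaluate it, and feed the reward back.
root = MCTNode()
for _ in range(20):
    arch, path = sample_architecture(root)
    reward = -random.random()   # stand-in for the negative training loss of `arch`
    update(path, reward)
```

In the paper's setting, the stand-in reward would presumably be replaced by the supernet training loss of the sampled architecture, and the accumulated per-node statistics are what the proposed node communication and hierarchical node selection refine during training and search.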
