AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing

Architecture performance predictors have been widely used in neural architecture search (NAS). Although simple and effective, the optimization objectives of prior work (e.g., precise accuracy estimation or a perfect ranking of all architectures in the space) do not capture the ranking nature of NAS: what ultimately matters is the relative order of the top architectures, not the exact accuracy of every candidate. In addition, building a reliable predictor usually requires a large number of ground-truth architecture-accuracy pairs, which makes the process computationally expensive. To overcome these limitations, we look at NAS from a novel point of view and introduce Learning to Rank (LTR) methods to select the best (ace) architectures from a search space. Specifically, we propose Normalized Discounted Cumulative Gain (NDCG) as the target metric and LambdaRank as the training algorithm. We further leverage weak supervision from weight sharing: the architecture representation is pretrained on weak labels obtained from the super-net, and the ranking model is then finetuned on a small number of architectures trained from scratch. Extensive experiments on NAS benchmarks and large-scale search spaces demonstrate that our approach outperforms state-of-the-art methods at a significantly reduced search cost.
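To make the ranking objective concrete, below is a minimal sketch (not the authors' implementation) of the two ingredients named above: NDCG as the evaluation metric, and LambdaRank-style gradients that weight each pairwise RankNet term by how much swapping the pair would change NDCG. The exponential gain 2^rel − 1 and log2 position discount are the conventional choices; the paper's exact gain/discount functions and model architecture may differ, and all function names here are illustrative.

```python
import numpy as np

def dcg(relevance, k=None):
    """Discounted Cumulative Gain of a ranked list of relevance labels."""
    rel = np.asarray(relevance, dtype=float)[:k]
    gains = 2.0 ** rel - 1.0                         # conventional exponential gain
    discounts = np.log2(np.arange(2, rel.size + 2))  # position i (1-based) -> log2(i + 1)
    return float(np.sum(gains / discounts))

def ndcg(scores, labels, k=None):
    """NDCG@k: rank items by predicted scores, normalize by the ideal ranking."""
    order = np.argsort(scores)[::-1]                 # predicted ranking, best first
    ideal_dcg = dcg(np.sort(labels)[::-1], k)        # DCG of the perfect ordering
    return dcg(np.asarray(labels)[order], k) / ideal_dcg if ideal_dcg > 0 else 0.0

def lambdarank_lambdas(scores, labels):
    """LambdaRank-style gradients: RankNet pairwise terms scaled by |Delta NDCG|."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n = scores.size
    ideal_dcg = dcg(np.sort(labels)[::-1])
    order = np.argsort(scores)[::-1]
    rank = np.empty(n, dtype=int)
    rank[order] = np.arange(n)                       # current 0-based position of each item
    lambdas = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue                             # only pairs where i should outrank j
            # Change in NDCG if items i and j swapped positions
            gain_diff = 2.0 ** labels[i] - 2.0 ** labels[j]
            disc_diff = 1.0 / np.log2(rank[i] + 2) - 1.0 / np.log2(rank[j] + 2)
            delta_ndcg = abs(gain_diff * disc_diff) / ideal_dcg
            rho = 1.0 / (1.0 + np.exp(scores[i] - scores[j]))  # RankNet sigmoid term
            lambdas[i] += delta_ndcg * rho           # push the better item up
            lambdas[j] -= delta_ndcg * rho           # push the worse item down
    return lambdas                                   # gradient ascent: scores += lr * lambdas

# Example: a perfect predicted ordering yields NDCG = 1.0
# ndcg([0.9, 0.2, 0.5], [2, 0, 1]) -> 1.0
```

In the NAS setting, `labels` would be graded relevance derived from the accuracies of architectures trained from scratch (or, during pretraining, weak labels from the super-net), and `scores` the ranking model's predictions; the returned lambdas act as per-architecture gradients on those scores.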
