One-Shot Neural Architecture Search via Compressive Sensing

Neural architecture search (NAS), the automated design of neural network models, remains a challenging meta-learning problem. Several recent works, known as "one-shot" approaches, have focused on dramatically reducing NAS running time by leveraging proxy models that still yield architectures with competitive performance. We propose a new meta-learning algorithm called CoNAS, for Compressive sensing-based Neural Architecture Search. Our approach merges ideas from one-shot approaches with iterative techniques for learning low-degree sparse Boolean polynomial functions. We validate our approach on several standard test datasets, discover architectures not previously reported, and achieve competitive (or better) results in both performance and search time compared to existing NAS approaches. Further, we support our algorithm with a theoretical analysis that provides upper bounds on the number of measurements needed to perform reliable meta-learning; to our knowledge, these analysis tools are new to the NAS literature and may be of independent interest.
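To make the phrase "learning low-degree sparse Boolean polynomial functions" concrete, the following is a minimal sketch, not the CoNAS implementation. It assumes an architecture is encoded as a vector in {-1, +1}^n and that its measured performance is well approximated by a few low-degree monomials. A synthetic function stands in for validation performance, and iterative hard thresholding is used as one example of an iterative sparse-recovery method. The names `parity_features` and `iterative_hard_thresholding`, the encoding, and all constants are illustrative assumptions.

```python
import itertools
import numpy as np


def parity_features(X, degree):
    """Evaluate every monomial chi_S(x) = prod_{i in S} x_i with |S| <= degree.

    X has shape (m, n) with entries in {-1, +1}; each row is one encoded
    architecture (the encoding is an assumption of this sketch).
    """
    n = X.shape[1]
    subsets = [S for d in range(degree + 1)
               for S in itertools.combinations(range(n), d)]
    columns = [np.prod(X[:, list(S)], axis=1) if S else np.ones(X.shape[0])
               for S in subsets]
    return np.column_stack(columns), subsets


def iterative_hard_thresholding(A, y, sparsity, n_iters=200):
    """Estimate a sparse coefficient vector alpha such that A @ alpha ~ y."""
    m, p = A.shape
    alpha = np.zeros(p)
    for _ in range(n_iters):
        # Unit gradient step on ||y - A alpha||^2 / (2 m).
        step = alpha + A.T @ (y - A @ alpha) / m
        # Keep only the `sparsity` largest-magnitude coefficients.
        keep = np.argsort(np.abs(step))[-sparsity:]
        alpha = np.zeros(p)
        alpha[keep] = step[keep]
    return alpha


# Hypothetical ground truth: a 3-sparse polynomial of degree <= 3 on 12 bits.
true_coeffs = {(0,): 1.0, (2, 5): -0.8, (1, 7): 0.9}

rng = np.random.default_rng(0)
n, degree, m = 12, 3, 200            # 200 measurements, 299 candidate monomials
X = rng.choice([-1.0, 1.0], size=(m, n))
A, subsets = parity_features(X, degree)

# Synthetic "performance" measurements generated from the ground truth.
y = sum(c * np.prod(X[:, list(S)], axis=1) for S, c in true_coeffs.items())

alpha_hat = iterative_hard_thresholding(A, y, sparsity=len(true_coeffs))
recovered = {subsets[j]: float(alpha_hat[j]) for j in np.flatnonzero(alpha_hat)}
print(recovered)
```

In this toy setting the number of measurements (200) is smaller than the number of candidate monomials (299) and far smaller than the 4096 possible inputs, which is the regime where sparse recovery helps. Real performance measurements are noisy and only approximately sparse, so the sparsity level and degree would need to be chosen with care.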
