Smooth Variational Graph Embeddings for Efficient Neural Architecture Search

In this paper, we propose an approach to neural architecture search (NAS) based on graph embeddings. NAS has previously been addressed using discrete, sampling-based methods, which are computationally expensive, as well as differentiable approaches, which are cheaper but impose stronger constraints on the search space. The proposed approach combines the advantages of both by building a smooth variational embedding space of neural architectures. At training time, we evaluate a structural subset of architectures in this space using their predicted performance; at inference time, the embedding allows extrapolation beyond this subset. We evaluate the proposed approach on two common search spaces, the graph structure defined by the ENAS approach and the NAS-Bench-101 search space, and improve over the state of the art in both.
