A Graph-Based Encoding for Evolutionary Convolutional Neural Network Architecture Design

Convolutional neural networks (CNNs) have demonstrated highly effective performance in image classification across a range of data sets. However, the best performance is only obtained when an appropriate architecture is chosen, which depends on both the volume and nature of the available training data. Many state-of-the-art architectures in the literature have been hand-crafted by human researchers, but this requires expertise in CNNs, domain knowledge, or trial-and-error experimentation, often using expensive computational resources. Recent work on evolutionary deep learning offers an alternative, in which evolutionary computation (EC) is applied to automatic architecture search. A key component of evolutionary deep learning is the chosen encoding strategy; however, previous approaches to CNN encoding in EC typically restrict the architectures that can be represented. Here, we propose an encoding strategy based on a directed acyclic graph (DAG) representation, and introduce an algorithm for randomly generating CNN architectures under this encoding. In contrast to previous work, the proposed encoding is more general, representing CNNs of arbitrary connection structure and unbounded depth. We demonstrate its effectiveness using a random search in which 200 randomly generated CNN architectures are evaluated. To improve computational efficiency, the 200 CNNs are trained using only 10% of the CIFAR-10 training data; the three best-performing CNNs are then re-trained on the full training set. The results show that the proposed representation and initialisation method can achieve promising accuracy compared with manually designed architectures, despite the simplicity of the random search and the reduced training data. We intend that future work will improve on these results by applying evolutionary search with this encoding.
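To make the idea concrete, the following is a minimal sketch of how a DAG-based CNN encoding with random generation might look. It is an illustration of the general technique, not the paper's actual encoding: the layer vocabulary (`LAYER_TYPES`), the fan-in limit, and the function names are hypothetical. Acyclicity is guaranteed by construction, since node `i` may only receive edges from nodes `0..i-1`, while skip connections and unbounded depth remain expressible.

```python
import random
from dataclasses import dataclass

# Hypothetical layer vocabulary; a real encoding would also carry
# hyperparameters (filter counts, strides, etc.) on each node.
LAYER_TYPES = ["conv3x3", "conv5x5", "maxpool", "avgpool"]


@dataclass
class Node:
    layer: str        # operation performed at this node
    inputs: list      # indices of predecessor nodes in the DAG


def random_cnn_dag(n_nodes, rng=random):
    """Randomly generate a CNN architecture as a DAG.

    Nodes are stored in topological order: node i draws its inputs
    only from nodes 0..i-1, so the graph is acyclic by construction
    while still allowing arbitrary skip connections.
    """
    nodes = [Node("input", [])]
    for i in range(1, n_nodes):
        # Each new node takes 1 or 2 earlier nodes as input
        # (the fan-in limit of 2 is an arbitrary choice here).
        k = rng.randint(1, min(2, i))
        preds = rng.sample(range(i), k)
        nodes.append(Node(rng.choice(LAYER_TYPES), sorted(preds)))
    return nodes


def is_acyclic(nodes):
    """Check the topological-order invariant: every edge points backwards."""
    return all(p < i for i, n in enumerate(nodes) for p in n.inputs)
```

Because depth is just the number of nodes and connectivity is unconstrained beyond acyclicity, this style of encoding can represent chain, residual-style, and densely connected topologies within one genotype space.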
