Evolving Deep Convolutional Neural Networks by Variable-Length Particle Swarm Optimization for Image Classification

Convolutional neural networks (CNNs) are one of the most effective deep learning methods to solve image classification problems, but the best architecture of a CNN to solve a specific problem can be extremely complicated and hard to design. This paper focuses on utilising Particle Swarm Optimisation (PSO) to automatically search for the optimal architecture of CNNs without any manual work involved. In order to achieve the goal, three improvements are made based on traditional PSO. First, a novel encoding strategy inspired by computer networks which empowers particle vectors to easily encode CNN layers is proposed; Second, in order to allow the proposed method to learn variable-length CNN architectures, a Disabled layer is designed to hide some dimensions of the particle vector to achieve variable-length particles; Third, since the learning process on large data is slow, partial datasets are randomly picked for the evaluation to dramatically speed it up. The proposed algorithm is examined and compared with 12 existing algorithms including the state-of-art methods on three widely used image classification benchmark datasets. The experimental results show that the proposed algorithm is a strong competitor to the state-of-art algorithms in terms of classification error. This is the first work using PSO for automatically evolving the architectures of CNNs.

[1]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Risto Miikkulainen,et al.  Evolving Neural Networks through Augmenting Topologies , 2002, Evolutionary Computation.

[3]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[4]  Jiwen Lu,et al.  PCANet: A Simple Deep Learning Baseline for Image Classification? , 2014, IEEE Transactions on Image Processing.

[5]  Mengjie Zhang,et al.  Evolving Deep Convolutional Neural Networks for Image Classification , 2017, IEEE Transactions on Evolutionary Computation.

[6]  Dong Yu,et al.  Exploring convolutional neural network structures and optimization techniques for speech recognition , 2013, INTERSPEECH.

[7]  Pascal Vincent,et al.  Contractive Auto-Encoders: Explicit Invariance During Feature Extraction , 2011, ICML.

[8]  Tsuyoshi Murata,et al.  {m , 1934, ACML.

[9]  Russell C. Eberhart,et al.  A new optimizer using particle swarm theory , 1995, MHS'95. Proceedings of the Sixth International Symposium on Micro Machine and Human Science.

[10]  Quoc V. Le,et al.  Large-Scale Evolution of Image Classifiers , 2017, ICML.

[11]  Geoffrey E. Hinton A Practical Guide to Training Restricted Boltzmann Machines , 2012, Neural Networks: Tricks of the Trade.

[12]  Vince Fuller,et al.  Classless Inter-Domain Routing (CIDR): an Address Assignment and Aggregation Strategy , 1993, RFC.

[13]  Yoshua Bengio,et al.  An empirical evaluation of deep architectures on problems with many factors of variation , 2007, ICML '07.

[14]  Stéphane Mallat,et al.  Invariant Scattering Convolution Networks , 2012, IEEE transactions on pattern analysis and machine intelligence.

[15]  Lawrence D. Jackel,et al.  Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.

[16]  Mengjie Zhang,et al.  A Particle Swarm Optimization-Based Flexible Convolutional Autoencoder for Image Classification , 2017, IEEE Transactions on Neural Networks and Learning Systems.

[17]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[18]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[19]  Honglak Lee,et al.  Learning and Selecting Features Jointly with Point-wise Gated Boltzmann Machines , 2013, ICML.

[20]  Honglak Lee,et al.  Learning Invariant Representations with Local Transformations , 2012, ICML.

[21]  Andries Petrus Engelbrecht,et al.  A study of particle swarm optimization particle trajectories , 2006, Inf. Sci..

[22]  Zhang Yi,et al.  Evolving Unsupervised Deep Neural Networks for Learning Meaningful Representations , 2017, IEEE Transactions on Evolutionary Computation.