Hyper-parameter optimization of convolutional neural network based on particle swarm optimization algorithm

Deep neural networks have made enormous progress on many problems. In particular, the convolutional neural network (CNN) is a class of deep networks that has become a dominant technique in computer vision tasks. Although these networks are highly effective, finding the ideal structure remains an open issue that needs further investigation. A deep CNN model is usually designed manually through repeated trial and error, which greatly constrains its application. Many hyper-parameters of a CNN affect model performance, including the depth of the network, the number of convolutional layers, and the number and sizes of the kernels. It is therefore a considerable challenge to design an appropriate CNN model that uses optimized hyper-parameters and reduces the reliance on manual involvement and domain expertise. In this paper, an architecture design method for CNNs is proposed that uses the particle swarm optimization (PSO) algorithm to learn the optimal CNN hyper-parameter values. In the experiments, we used the Modified National Institute of Standards and Technology (MNIST) database of handwritten digits. The experiments showed that the proposed approach can find an architecture that is competitive with state-of-the-art models, with a testing error of 0.87%.
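To make the idea concrete, the following is a minimal sketch of a PSO search over integer CNN hyper-parameters. It is not the paper's implementation: the search bounds, the PSO coefficients (`w`, `c1`, `c2`), and in particular the `surrogate_error` function are assumptions introduced for illustration. In a real run, the fitness function would train a CNN with the decoded hyper-parameters and return its validation error.

```python
import random

# Hypothetical search space: inclusive integer bounds per hyper-parameter.
BOUNDS = {
    "conv_layers": (1, 5),
    "kernels": (8, 64),
    "kernel_size": (2, 7),
}


def surrogate_error(params):
    """Stand-in for 'train the CNN and return its validation error'.

    Assumption: a toy quadratic bowl with its minimum at
    3 layers, 32 kernels, kernel size 5. Replace with real training.
    """
    target = {"conv_layers": 3, "kernels": 32, "kernel_size": 5}
    return sum(
        ((params[k] - target[k]) / (BOUNDS[k][1] - BOUNDS[k][0])) ** 2
        for k in BOUNDS
    )


def decode(position):
    """Round and clip a continuous position to valid integer hyper-parameters."""
    return {
        k: min(max(int(round(x)), BOUNDS[k][0]), BOUNDS[k][1])
        for k, x in zip(BOUNDS, position)
    }


def pso(fitness, n_particles=10, n_iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; returns (best hyper-parameters, best error)."""
    rng = random.Random(seed)
    keys = list(BOUNDS)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(*BOUNDS[k]) for k in keys] for _ in range(n_particles)]
    vel = [[0.0] * len(keys) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_err = [fitness(decode(p)) for p in pos]
    g = min(range(n_particles), key=pbest_err.__getitem__)
    gbest, gbest_err = pbest[g][:], pbest_err[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, k in enumerate(keys):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (
                    w * vel[i][d]
                    + c1 * r1 * (pbest[i][d] - pos[i][d])
                    + c2 * r2 * (gbest[d] - pos[i][d])
                )
                lo, hi = BOUNDS[k]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            err = fitness(decode(pos[i]))
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i][:], err
    return decode(gbest), gbest_err


if __name__ == "__main__":
    best_params, best_err = pso(surrogate_error)
    print(best_params, best_err)
```

Particles move in a continuous space and are rounded to integers only when evaluated, which is one common way to handle discrete hyper-parameters. Because each fitness evaluation would correspond to a full CNN training run, the swarm size and iteration count dominate the cost of the search.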
