Assessing Hyperparameter Optimization and Speedup for Convolutional Neural Networks

The increased processing power of graphics processing units (GPUs) and the availability of large image datasets have fostered a renewed interest in extracting semantic information from images. Promising results on complex image categorization problems have been achieved using deep learning, with neural networks composed of many layers. Convolutional neural networks (CNNs) are one such architecture and are particularly well suited to image classification. Advances in CNNs make it possible to train models on large labelled image datasets, but the hyperparameters of a network must be specified in advance, which is challenging because of the size of the search space. Substantial computational power and processing time are required to find the hyperparameter settings that yield a well-performing model. This article surveys hyperparameter search and optimization methods for CNN architectures.
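
To make the search problem concrete, the sketch below illustrates one of the simplest methods covered by surveys of this kind: random search over a small CNN hyperparameter space. The search space, the trial budget, and the train_and_evaluate placeholder are illustrative assumptions introduced here, not the procedure of any particular cited work; in a real experiment the placeholder would train a CNN with the sampled configuration and return a validation metric.

    import random

    # Hypothetical search space over common CNN hyperparameters.
    # Each entry is a sampler; learning rate is drawn log-uniformly.
    SEARCH_SPACE = {
        "learning_rate": lambda: 10 ** random.uniform(-4, -1),
        "batch_size": lambda: random.choice([32, 64, 128, 256]),
        "num_filters": lambda: random.choice([16, 32, 64]),
        "dropout_rate": lambda: random.uniform(0.2, 0.6),
    }

    def sample_config():
        # Draw one hyperparameter configuration at random.
        return {name: draw() for name, draw in SEARCH_SPACE.items()}

    def train_and_evaluate(config):
        # Placeholder: in practice, build and train a CNN using `config`
        # on a labelled image dataset and return validation accuracy.
        # A random score stands in here so the sketch runs end to end.
        return random.random()

    def random_search(num_trials=20):
        # Keep the best configuration seen across independent trials.
        best_config, best_score = None, float("-inf")
        for _ in range(num_trials):
            config = sample_config()
            score = train_and_evaluate(config)
            if score > best_score:
                best_config, best_score = config, score
        return best_config, best_score

    if __name__ == "__main__":
        config, score = random_search()
        print("best config:", config, "score:", score)

Even this naive strategy makes the survey's central cost argument visible: each trial requires a full training run, so the budget of num_trials multiplies the already substantial cost of training a single CNN, which motivates the smarter search and speedup methods the article reviews.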
