Hyper-parameter selection in deep neural networks using parallel particle swarm optimization

The need for manual hyper-parameter selection can seriously hamper the optimization of Deep Neural Network (DNN) models. Conventional automated approaches to this problem suffer from poor scalability or fail in certain scenarios. In this paper, we introduce a parallel method that applies Particle Swarm Optimization (PSO) to hyper-parameter selection in DNNs. To estimate the best hyper-parameters, a population of particles is evolved, with their fitness evaluated in parallel. The experimental results demonstrate very desirable scalability properties for different DNNs. We show that the parallel PSO can further optimize existing models designed by experts in an affordable amount of time.
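
The sketch below illustrates, under assumed settings, how such a swarm can explore a hyper-parameter space while particle fitness evaluations (here a stand-in for full DNN training runs) are dispatched to parallel workers. The search space, swarm size, fitness function, and PSO coefficients are illustrative assumptions, not the configuration used in the paper.

# Minimal sketch of parallel PSO for hyper-parameter selection (assumed setup).
import numpy as np
from multiprocessing import Pool

BOUNDS = np.array([[1e-4, 1e-1],   # learning rate range (assumption)
                   [0.0, 0.9]])    # dropout rate range (assumption)
N_PARTICLES, N_ITER = 8, 20
W, C1, C2 = 0.7, 1.5, 1.5          # inertia and acceleration coefficients

def fitness(position):
    """Placeholder for training a DNN with these hyper-parameters and
    returning its validation error; a synthetic quadratic is used here."""
    lr, dropout = position
    return (np.log10(lr) + 2.5) ** 2 + (dropout - 0.3) ** 2

def pso():
    rng = np.random.default_rng(0)
    dim = len(BOUNDS)
    pos = rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], (N_PARTICLES, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.full(N_PARTICLES, np.inf)

    with Pool() as pool:                       # evaluate all particles in parallel
        for _ in range(N_ITER):
            vals = np.array(pool.map(fitness, pos))
            improved = vals < pbest_val
            pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
            gbest = pbest[np.argmin(pbest_val)]

            # standard PSO velocity and position update, clipped to the bounds
            r1, r2 = rng.random((2, N_PARTICLES, dim))
            vel = W * vel + C1 * r1 * (pbest - pos) + C2 * r2 * (gbest - pos)
            pos = np.clip(pos + vel, BOUNDS[:, 0], BOUNDS[:, 1])

    return pbest[np.argmin(pbest_val)], pbest_val.min()

if __name__ == "__main__":
    best_pos, best_val = pso()
    print("best hyper-parameters:", best_pos, "fitness:", best_val)

In practice the placeholder fitness would train and validate a network for each particle, so the parallel evaluation step dominates the runtime and is where the scalability gains reported in the abstract would come from.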
