Reducing the Search Space for Hyperparameter Optimization Using Group Sparsity

We propose a new algorithm for hyperparameter selection in machine learning. The algorithm is a novel modification of Harmonica, a spectral hyperparameter selection approach based on sparse recovery methods. In particular, we show that a special encoding of the hyperparameter space enables a natural group-sparse recovery formulation, which, when coupled with HyperBand (a multi-armed bandit strategy), leads to improvements over existing hyperparameter optimization methods such as Successive Halving and Random Search. Experimental results on image datasets such as CIFAR-10 confirm the benefits of our approach.
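To make the group-sparse recovery idea concrete, the sketch below (not the authors' implementation; the toy search space, helper names such as group_lasso, and the regularization level are illustrative assumptions) one-hot encodes categorical hyperparameters so that each hyperparameter's indicator columns form a group, fits a group-lasso regression of observed validation losses on sampled configurations via proximal gradient descent, and keeps only the hyperparameters whose coefficient groups survive the shrinkage.

# Minimal sketch (assumed details, not the authors' code): group-sparse recovery
# over a one-hot encoding of categorical hyperparameters, solved by proximal
# gradient descent on the group-lasso objective.
import numpy as np

def group_soft_threshold(w, groups, step, lam):
    # Block soft-thresholding: shrink each hyperparameter's block of weights jointly.
    w = w.copy()
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        w[idx] = 0.0 if norm <= step * lam else w[idx] * (1.0 - step * lam / norm)
    return w

def group_lasso(X, y, groups, lam=0.05, n_iters=500):
    # Minimize (1/2n)||y - Xw||^2 + lam * sum_g ||w_g||_2 by proximal gradient descent.
    n, d = X.shape
    w = np.zeros(d)
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the smooth part
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y) / n
        w = group_soft_threshold(w - step * grad, groups, step, lam)
    return w

# Toy setting: three categorical hyperparameters with cardinalities 4, 3, and 5;
# each hyperparameter's one-hot indicator columns form one group.
rng = np.random.default_rng(0)
n_configs, cards = 200, [4, 3, 5]
groups, start = [], 0
for c in cards:
    groups.append(np.arange(start, start + c))
    start += c

# Each row of X is one randomly sampled configuration; y is its (noisy) validation loss.
X = np.hstack([np.eye(c)[rng.integers(c, size=n_configs)] for c in cards])
true_w = np.zeros(start)
true_w[groups[1]] = [1.0, -2.0, 1.0]  # only the second hyperparameter matters here
y = X @ true_w + 0.1 * rng.standard_normal(n_configs)

w_hat = group_lasso(X, y, groups, lam=0.05)
kept = [g for g, idx in enumerate(groups) if np.linalg.norm(w_hat[idx]) > 1e-8]
print("hyperparameters retained in the reduced search space:", kept)

Under this encoding, zeroing an entire group corresponds to declaring that hyperparameter unimportant, so a subsequent bandit-based search such as HyperBand or Successive Halving only explores the surviving dimensions; the paper's actual encoding, solver, and selection thresholds may differ.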
