Coverage-Based Designs Improve Sample Mining and Hyperparameter Optimization

Sampling one or more effective solutions from a large search space is a recurring problem in machine learning (ML), and sequential optimization has become a popular approach to it. Typical examples include data summarization, sample mining for predictive modeling, and hyperparameter optimization. Existing solutions attempt to adaptively trade off global exploration against local exploitation, and the quality of the initial exploratory sample is critical to their success. While discrepancy-based samples have become the de facto choice for exploration, results from computer graphics suggest that coverage-based designs, e.g., Poisson disk sampling, can be a superior alternative. To successfully adapt coverage-based sample designs, which were originally developed for 2-D image analysis, to ML applications, we propose fundamental advances: we construct a parameterized family of designs with provably improved coverage characteristics, and we develop algorithms for effective sample synthesis. Through experiments in sample mining and hyperparameter optimization for supervised learning, we show that our approach consistently outperforms existing exploratory sampling methods in both blind exploration and sequential search with Bayesian optimization.
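The abstract does not spell out the proposed family of designs or its synthesis algorithms, so the sketch below only illustrates the classical baseline the work builds on: dart-throwing Poisson disk sampling in the unit hypercube, which enforces a minimum pairwise distance (coverage) rather than low discrepancy. The function name and parameter choices (`poisson_disk_sample`, `radius`, the attempt budget) are hypothetical, chosen for illustration and not taken from the paper.

```python
import numpy as np

def poisson_disk_sample(n_points, dim, radius, max_attempts=100_000, seed=0):
    """Naive dart-throwing Poisson disk sampler on the unit hypercube [0, 1]^dim.

    A uniform candidate is accepted only if it lies at least `radius` away
    from every previously accepted point; this minimum-distance constraint
    is the coverage (blue-noise) property that distinguishes such designs
    from low-discrepancy sequences like Sobol points.
    """
    rng = np.random.default_rng(seed)
    points = []
    for _ in range(max_attempts):
        if len(points) >= n_points:
            break
        candidate = rng.random(dim)
        # all() over an empty list is True, so the first candidate is kept.
        if all(np.linalg.norm(candidate - p) >= radius for p in points):
            points.append(candidate)
    return np.asarray(points)

# Example: a 32-point exploratory design over a 4-D hyperparameter space.
design = poisson_disk_sample(n_points=32, dim=4, radius=0.2)
print(design.shape)  # (32, 4) if dart throwing succeeds within the budget
```

In a blind-exploration or Bayesian-optimization workflow, such a design would replace the i.i.d. random or low-discrepancy initial sample; the paper's contribution, per the abstract, is a parameterized family of such designs with provably better coverage together with synthesis algorithms that remain effective beyond the 2-D settings where Poisson disk sampling originated.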
