Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search

Computational models in fields such as computational neuroscience are often evaluated via stochastic simulation or numerical approximation. Fitting these models implies a difficult optimization problem over complex, possibly noisy parameter landscapes. Bayesian optimization (BO) has been successfully applied to solving expensive black-box problems in engineering and machine learning. Here we explore whether BO can be applied as a general tool for model fitting. First, we present a novel hybrid BO algorithm, Bayesian adaptive direct search (BADS), that achieves competitive performance with an affordable computational overhead for the running time of typical models. We then perform an extensive benchmark of BADS vs. many common and state-of-the-art nonconvex, derivative-free optimizers, on a set of model-fitting problems with real data and models from six studies in behavioral, cognitive, and computational neuroscience. With default settings, BADS consistently finds comparable or better solutions than other methods, including 'vanilla' BO, showing great promise for advanced BO techniques, and BADS in particular, as a general model-fitting tool.
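The abstract frames model fitting as optimization of a noisy objective obtained by stochastic simulation. Below is a minimal Python sketch of that setting, not taken from the paper: the psychophysical model, data, parameter values, and the choice of a Nelder-Mead baseline are all illustrative assumptions. In practice, the same objective function would be handed to BADS or to one of the other derivative-free optimizers in the benchmark.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical behavioral data set: binary responses to 200 stimulus intensities,
# generated from a simple psychophysical model with sensory noise sigma = 0.8.
stimuli = np.linspace(-2, 2, 200)
responses = rng.random(200) < norm.cdf(stimuli / 0.8)

def neg_log_likelihood(theta, n_sim=500):
    """Noisy objective: the response probability for each trial is estimated by
    stochastic simulation, so repeated calls return slightly different values."""
    sigma = np.exp(theta[0])  # log-parameterization keeps sigma positive
    sims = rng.normal(stimuli[:, None], sigma, size=(stimuli.size, n_sim)) > 0
    p_resp = np.clip(sims.mean(axis=1), 1e-3, 1 - 1e-3)
    return -np.sum(np.where(responses, np.log(p_resp), np.log(1.0 - p_resp)))

# A standard derivative-free optimizer stands in here for the methods compared in
# the benchmark; the motivation for BADS is to handle exactly this kind of noisy
# landscape more robustly than simplex- or gradient-based routines.
result = minimize(neg_log_likelihood, x0=np.array([0.0]), method="Nelder-Mead")
print("fitted sigma:", float(np.exp(result.x[0])))
```

Because the Monte Carlo estimate fluctuates from call to call, a plain simplex search can stall on the noise; this is the scenario in which surrogate-based methods such as BADS are designed to help.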
