Uncertainty quantification and exploration–exploitation trade-off in humans

The main objective of this paper is to outline a theoretical framework for analysing how humans’ decision-making strategies under uncertainty manage the trade-off between information gathering (exploration) and reward seeking (exploitation). A key observation motivating this line of research is that human learners are remarkably fast and effective at adapting to unfamiliar environments and incorporating new knowledge: this behaviour is intriguing for the cognitive sciences and an important challenge for machine learning. The target problem is active learning in a black-box optimization task, and more specifically how the exploration/exploitation dilemma can be modelled within the Gaussian Process based Bayesian Optimization framework, which in turn relies on uncertainty quantification. The main contribution is an analysis of humans’ decisions with respect to Pareto rationality, where the two objectives are expected improvement and uncertainty quantification. According to this Pareto rationality model, if a decision set contains a Pareto efficient (dominant) strategy, a rational decision maker should always select it over its dominated alternatives. The distance from the Pareto frontier determines whether a choice is (Pareto) rational (i.e., lies on the frontier) or is associated with “exasperate” exploration. Since uncertainty is one of the two objectives defining the Pareto frontier, we investigated three different uncertainty quantification measures and selected the one most compliant with the proposed Pareto rationality model. The key result is an analytical framework that characterizes how deviations from “rationality” depend on uncertainty quantification and on the evolution of the reward-seeking process.
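The Pareto rationality idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions made only for illustration: uncertainty is measured by the predictive standard deviation (the abstract does not say which of the three measures was selected), expected improvement takes its standard closed form for a maximisation problem, the distance from the frontier is Euclidean in objective space, and the candidate means and standard deviations are invented values. All function names are hypothetical.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form expected improvement (maximisation) for a Gaussian prediction."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def dominates(a, b):
    """True if a is at least as good as b in every objective and better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Points not dominated by any other point (both objectives maximised)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def distance_to_front(choice, front):
    """Euclidean distance from a choice to the nearest frontier point (assumed metric)."""
    return min(math.dist(choice, p) for p in front)


# Hypothetical predictive means and standard deviations at five candidate locations.
candidates = [(1.2, 0.1), (0.9, 0.5), (0.5, 1.0), (0.8, 0.2), (0.3, 0.3)]
best_seen = 1.0  # hypothetical best value observed so far

# Objective space: (expected improvement, uncertainty).
objectives = [(expected_improvement(m, s, best_seen), s) for m, s in candidates]
front = pareto_front(objectives)

for (m, s), obj in zip(candidates, objectives):
    d = distance_to_front(obj, front)
    label = "on the frontier" if d == 0.0 else "dominated"
    print(f"mu={m:.1f} sigma={s:.1f} EI={obj[0]:.4f} distance={d:.4f} ({label})")
```

Under these assumptions, a choice at distance zero counts as Pareto rational, while a positive distance measures how far the decision departs from the frontier.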
