A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment

Decisions often benefit from learned expectations about the sequential structure of the evidence. Here we show that individual differences in this learning process can reflect different implicit assumptions about sequence complexity, leading to performance trade-offs. For a task requiring decisions about dynamic evidence streams, human subjects with more flexible, history-dependent choices (low bias) had greater trial-to-trial choice variability (high variance). In contrast, subjects with more history-independent choices (high bias) were more predictable (low variance). We accounted for these behaviours using models in which assumed complexity was encoded by the size of the hypothesis space over the latent rate of change of the source of evidence. The most parsimonious model used an efficient sampling algorithm in which the range of sampled hypotheses represented an information bottleneck that gave rise to a bias–variance trade-off. This trade-off, which is well known in machine learning, may thus also have broad applicability to human decision-making.

Glaze et al. show that individual variability in learning from noisy evidence involves a bias–variance trade-off that is best explained by a model using a sampling algorithm that approximates optimal inference.
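The idea of a hypothesis space over the latent rate of change can be illustrated with a small sketch. This is not the authors' model: it assumes two sources that emit Gaussian evidence with means +mu and -mu, a fixed but unknown per-observation switch probability (the hazard rate), and a finite set of hazard-rate hypotheses drawn uniformly from a range. The names `infer_source` and `sample_hazards` and all parameter values are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def sample_hazards(count, low, high, seed=0):
    """Draw `count` hazard-rate hypotheses uniformly from [low, high].

    The width of this range is the illustrative stand-in for the size of
    the hypothesis space (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(count)]


def infer_source(observations, hazards, mu=1.0, sd=1.0):
    """Posterior probability that the current source is the +mu source.

    Each hazard hypothesis keeps its own belief about the source; the
    hypotheses are then averaged, weighted by how well each one has
    predicted the observations so far.
    """
    n = len(hazards)
    p_plus = [0.5] * n        # P(source = +mu | data, hazard_i)
    weight = [1.0 / n] * n    # P(hazard_i | data), uniform prior
    beliefs = []
    for x in observations:
        like_plus = gaussian_pdf(x, mu, sd)
        like_minus = gaussian_pdf(x, -mu, sd)
        for i, h in enumerate(hazards):
            # The source may have switched with probability h.
            prior_plus = (1.0 - h) * p_plus[i] + h * (1.0 - p_plus[i])
            evidence = prior_plus * like_plus + (1.0 - prior_plus) * like_minus
            p_plus[i] = prior_plus * like_plus / evidence
            weight[i] *= evidence
        total = sum(weight)
        weight = [w / total for w in weight]
        beliefs.append(sum(w * p for w, p in zip(weight, p_plus)))
    return beliefs


if __name__ == "__main__":
    data = [1.0, 1.0, 1.0, -1.0, -1.0]
    narrow = infer_source(data, [0.5])                        # one rigid hypothesis
    wide = infer_source(data, sample_hazards(20, 0.01, 0.99))  # broad hypothesis set
    print(narrow)
    print(wide)
```

In this sketch, an observer restricted to the single hypothesis that the hazard rate is 0.5 ignores history entirely, so its belief depends only on the current observation. An observer that samples from a wide range can adapt to the sequence, at the cost of beliefs that depend on which hypotheses happened to be sampled.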
