Computation noise promotes cognitive resilience to adverse conditions during decision-making

Random noise in information processing systems is widely seen as detrimental to function. Yet despite the large trial-to-trial variability of neural activity and behavior, humans and other animals show a remarkable adaptability to unexpected adverse events occurring during task execution. This cognitive ability, described as constitutive of general intelligence, is missing from current artificial intelligence (AI) systems, which feature exact (noise-free) computations. Here we show that implementing computation noise in recurrent neural networks boosts their cognitive resilience to a variety of adverse conditions entirely unseen during training, in a way that resembles human and animal cognition. In contrast to artificial agents with exact computations, noisy agents exhibit hallmarks of Bayesian inference acquired in a ‘zero-shot’ fashion, without prior experience with conditions that require these computations for maximizing rewards. We further demonstrate that these cognitive benefits result from free-standing regularization of activity patterns in noisy neural networks. Together, these findings suggest that intelligence may ride on computation noise to promote near-optimal decision-making in adverse conditions without any engineered cognitive sophistication.
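
As a rough illustration of what injecting computation noise into a recurrent network could look like, the sketch below adds independent Gaussian noise to every hidden-state update of a vanilla RNN cell in PyTorch. The class name, the `noise_std` parameter, and the choice of simple additive (rather than update-scaled) noise are illustrative assumptions rather than the exact implementation used in the study; the point of the sketch is only that the noise perturbs the computation itself and is applied at every step, during evaluation as well as training.

```python
import torch
import torch.nn as nn

class NoisyRNNCell(nn.Module):
    """Vanilla recurrent cell whose hidden-state update is corrupted by
    additive Gaussian noise at every time step (training and test alike)."""

    def __init__(self, input_size, hidden_size, noise_std=0.1):
        super().__init__()
        self.cell = nn.RNNCell(input_size, hidden_size, nonlinearity="tanh")
        self.noise_std = noise_std  # hypothetical noise level, not taken from the paper

    def forward(self, x, h):
        h_exact = self.cell(x, h)                        # exact (noise-free) update
        noise = self.noise_std * torch.randn_like(h_exact)
        return h_exact + noise                           # noisy update

# Minimal usage: unroll the noisy cell over a dummy 10-step episode.
cell = NoisyRNNCell(input_size=4, hidden_size=48, noise_std=0.1)
readout = nn.Linear(48, 2)                               # e.g. two-alternative choice
h = torch.zeros(1, 48)
for x_t in torch.randn(10, 1, 4):
    h = cell(x_t, h)
policy_logits = readout(h)
```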
