Mice adaptively generate choice variability in a deterministic task

Can decisions be made solely by chance? To investigate this question, we designed a deterministic setting in which mice are rewarded for non-repetitive choice sequences, and modeled the experiment using reinforcement learning. We found that mice progressively increased their choice variability by adopting a memory-free, pseudo-random selection strategy, rather than by learning complex sequences. Our results demonstrate that a decision-making process can self-generate variability and randomness even when the rules governing reward delivery are neither stochastic nor volatile.
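The contrast between a memory-free random strategy and a learned fixed sequence can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the task or the model used in the study: the reward rule (`rewarded`), the pattern length `k`, the `window` size, the number of targets, and both policies are hypothetical choices made for illustration. The rule is deterministic given the choice history, and it rewards a choice only when the newest `k`-choice pattern has not appeared in the recent past.

```python
import random


def rewarded(history, choice, k=3, window=12):
    """Hypothetical deterministic rule (an assumption, not the paper's rule):
    reward when the newest k-choice pattern is absent from the recent past."""
    recent = history[-window:] + [choice]
    if len(recent) < k + 1:
        return False  # warm-up: no earlier pattern to compare against
    newest = tuple(recent[-k:])
    earlier = {tuple(recent[i:i + k]) for i in range(len(recent) - k)}
    return newest not in earlier


def run(policy, n_trials=2000):
    """Return the fraction of rewarded trials for a given policy."""
    history = []
    n_rewards = 0
    for t in range(n_trials):
        choice = policy(t, history)
        n_rewards += rewarded(history, choice)
        history.append(choice)
    return n_rewards / n_trials


def random_policy(rng, n_targets=3):
    """Memory-free policy: ignores the history and picks uniformly."""
    return lambda t, history: rng.randrange(n_targets)


def cyclic_policy(n_targets=3):
    """Fixed repeating sequence 0, 1, 2, 0, 1, 2, ..."""
    return lambda t, history: t % n_targets


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the run is reproducible
    print("random:", run(random_policy(rng)))
    print("cyclic:", run(cyclic_policy()))
```

Under this assumed rule, the cyclic policy stops earning rewards once its three patterns have all appeared, whereas the memory-free policy keeps producing novel patterns without storing or planning any sequence.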

[1]  Philippe Faure, et al. Mice adaptively generate choice variability in a deterministic task, 2020, Communications Biology.
