Bayesian Behavioral Systems Theory

Behavioral Systems Theory suggests that observable behavior is embedded in a hierarchy. A CS elicits behavior because, after learning, it activates a pathway through this hierarchy. Much of Timberlake's work on Behavioral Systems Theory focuses on the conditions that support the conditioning of these pathways. Most notably, his work shows that the identity of the CS, the identity of the US, and the CS-US interval all help support conditioning of the system. Here, we use recent experiments in the interval timing literature to motivate a Bayesian implementation of Behavioral Systems Theory. In this implementation, a probability distribution is defined over possible pathways through the hierarchy, and the pathway that maximizes reinforcement is elicited. This probability distribution is conditioned on background information, such as the CS-US interval and the animal's motivational state. Lower-level actions in the hierarchy, such as tracking prey, are conditioned on higher-level goals, such as the general search for food. Our implementation of Behavioral Systems Theory captures the essential features of Timberlake's verbal model; it acts as a glue, integrating sensory, timing, and decision mechanisms with observed behavior.
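As a rough illustration of this idea, and not the model specified in the paper, the following Python sketch builds a distribution over pathways in a two-level hierarchy and elicits the pathway with the highest probability-weighted payoff. Everything in it is an assumption made for illustration: the mode and action names, the functional form of P(mode | context), the conditional table P(action | mode), the payoff values, and the rule of scoring a pathway by probability times payoff, which is only one possible reading of "maximizes reinforcement".

```python
from itertools import product

# Hypothetical two-level hierarchy: higher-level modes, lower-level actions.
MODES = ("general_search", "focal_search")
ACTIONS = ("locomote", "track_prey", "approach_food_site")

# Assumed P(action | mode): lower-level actions conditioned on the higher-level goal.
P_ACTION_GIVEN_MODE = {
    "general_search": {"locomote": 0.7, "track_prey": 0.2, "approach_food_site": 0.1},
    "focal_search": {"locomote": 0.1, "track_prey": 0.3, "approach_food_site": 0.6},
}

# Assumed reinforcement payoff of each action (illustrative numbers only).
PAYOFF = {"locomote": 0.2, "track_prey": 0.5, "approach_food_site": 1.0}


def p_mode_given_context(mode, cs_us_interval_s, hunger):
    """Assumed P(mode | context), with hunger in [0, 1] and interval in seconds.

    Illustrative form: short CS-US intervals and high hunger favor focal search.
    """
    p_focal = hunger * 10.0 / (10.0 + cs_us_interval_s)
    return p_focal if mode == "focal_search" else 1.0 - p_focal


def pathway_distribution(cs_us_interval_s, hunger):
    """P(mode, action | context) = P(mode | context) * P(action | mode)."""
    return {
        (mode, action): p_mode_given_context(mode, cs_us_interval_s, hunger)
        * P_ACTION_GIVEN_MODE[mode][action]
        for mode, action in product(MODES, ACTIONS)
    }


def select_pathway(cs_us_interval_s, hunger):
    """Elicit the pathway with the largest probability-weighted payoff."""
    dist = pathway_distribution(cs_us_interval_s, hunger)
    return max(dist, key=lambda path: dist[path] * PAYOFF[path[1]])


short_interval_choice = select_pathway(cs_us_interval_s=5.0, hunger=0.9)
long_interval_choice = select_pathway(cs_us_interval_s=90.0, hunger=0.9)
```

With these made-up numbers, the selected pathway changes with the CS-US interval while the motivational state is held fixed, which is the kind of dependence on background information described above.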
