JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR 2005, 84, 581–617 NUMBER 3 (NOVEMBER)

LINEAR-NONLINEAR-POISSON MODELS OF PRIMATE CHOICE DYNAMICS

The equilibrium phenomenon of matching behavior has traditionally been studied in stationary environments. Here we attempt to uncover the local mechanism of choice that gives rise to matching by studying behavior in a highly dynamic foraging environment. In our experiments, 2 rhesus monkeys (Macaca mulatta) foraged for juice rewards by making eye movements to one of two colored icons presented on a computer monitor, each rewarded on a dynamic variable-interval schedule. Using a generalization of Wiener kernel analysis, we recover a compact mechanistic description of the impact of past reward on future choice in the form of a Linear-Nonlinear-Poisson model. We validate this model through rigorous predictive and generative testing. Compared to our earlier work with this same data set, this model proves to be a better description of choice behavior and is more tightly correlated with putative neural value signals. Refinements over previous models include hyperbolic (as opposed to exponential) temporal discounting of past rewards, and differential (as opposed to fractional) comparisons of option value. Through numerical simulation we find that, within this class of strategies, the model parameters employed by the animals are very close to those that maximize reward harvesting efficiency.
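The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' fitted model: the kernel form `1 / (1 + k / tau)`, the logistic nonlinearity, the Bernoulli draw standing in for the Poisson stage, and every function and parameter name (`hyperbolic_kernel`, `tau`, `sigma`, and so on) are assumptions made here for illustration only.

```python
import math
import random


def hyperbolic_kernel(length, tau):
    """Linear stage weights for rewards 1..length trials back.

    Assumed form: 1 / (1 + k / tau), normalized to sum to 1.
    """
    raw = [1.0 / (1.0 + k / tau) for k in range(1, length + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def filtered_value(rewards, kernel):
    """Weighted sum of past rewards; `rewards` lists the most recent trial last."""
    recent_first = reversed(rewards[-len(kernel):])
    return sum(w * r for w, r in zip(kernel, recent_first))


def choice_probability(rewards_a, rewards_b, kernel, sigma):
    """Nonlinear stage: logistic function of the value DIFFERENCE (assumed form)."""
    d = filtered_value(rewards_a, kernel) - filtered_value(rewards_b, kernel)
    return 1.0 / (1.0 + math.exp(-d / sigma))


def simulate_choice(p_a, rng):
    """Stochastic stage: one Bernoulli draw, a simple stand-in for Poisson output."""
    return "A" if rng.random() < p_a else "B"


# Hypothetical usage: option A was rewarded on the two most recent trials.
kernel = hyperbolic_kernel(length=10, tau=2.0)
p_a = choice_probability([0, 1, 1], [1, 0, 0], kernel, sigma=0.2)
choice = simulate_choice(p_a, random.Random(0))
```

The sketch compares the two filtered reward histories by subtraction rather than as a ratio, mirroring the differential comparison described in the abstract. The hyperbolic weights fall off more slowly at long lags than an exponential kernel would.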
