Rapid decision threshold modulation by reward rate in a neural network

Optimal performance in two-alternative, free-response decision-making tasks can be achieved by the drift-diffusion model of decision making, which can be implemented in a neural network, provided that the model's threshold parameter can be adapted to different task conditions. There is evidence that people seek to maximize reward in such tasks by modulating response thresholds. However, few models of threshold adaptation have been proposed, and none has been implemented using neurally plausible mechanisms. Here we propose a neural network that adapts thresholds in order to maximize reward rate. The model makes predictions regarding optimal performance, provides a benchmark against which actual performance can be compared, and yields testable predictions about how reward rate may be encoded by neural mechanisms.
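To make the link between threshold and reward rate concrete, the following is a minimal sketch, not the network proposed here. It assumes a pure drift-diffusion process with an unbiased starting point and symmetric thresholds at ±z, for which the error rate and mean decision time have standard closed forms. Reward rate is taken, as an assumption, to be the proportion of correct responses divided by the mean time per trial. The parameter values (drift, noise, non_decision, rsi) and the hill-climbing routine adapt_threshold are hypothetical and chosen only for illustration.

```python
import math


def error_rate(z, drift=1.0, noise=1.0):
    """Error rate of a pure DDM with thresholds at +/- z and an unbiased start."""
    return 1.0 / (1.0 + math.exp(2.0 * drift * z / noise**2))


def mean_decision_time(z, drift=1.0, noise=1.0):
    """Mean decision time of the same pure DDM."""
    return (z / drift) * math.tanh(drift * z / noise**2)


def reward_rate(z, drift=1.0, noise=1.0, non_decision=0.3, rsi=1.0):
    """Assumed reward rate: fraction correct per unit of mean trial time.

    non_decision (sensory/motor latency) and rsi (response-to-stimulus
    interval) are illustrative values, not taken from the paper.
    """
    trial_time = mean_decision_time(z, drift, noise) + non_decision + rsi
    return (1.0 - error_rate(z, drift, noise)) / trial_time


def adapt_threshold(z=0.1, step=0.05, iters=200, **params):
    """Hypothetical hill climber: move z toward higher reward rate.

    The step is halved whenever neither neighbour improves on the current z.
    This stands in for, and is not, the neural adaptation mechanism.
    """
    for _ in range(iters):
        candidates = (max(z - step, 0.0), z, z + step)
        best = max(candidates, key=lambda t: reward_rate(t, **params))
        if best == z:
            step *= 0.5
        z = best
    return z


if __name__ == "__main__":
    z_opt = adapt_threshold()
    print(f"threshold ~ {z_opt:.3f}, reward rate ~ {reward_rate(z_opt):.3f}")
```

Under these assumptions the reward rate is low for very small thresholds (fast but inaccurate responses) and for very large ones (accurate but slow responses), so an intermediate threshold maximizes it; changing rsi or drift shifts that optimum, which is why the threshold must be adapted to task conditions.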
