Optimal policy for multi-alternative decisions

Everyday decisions frequently require choosing among multiple alternatives. Yet the optimal policy for such decisions is unknown. Here we derive the normative policy for general multi-alternative decisions. This strategy requires evidence accumulation to nonlinear, time-dependent bounds that trigger choices. A geometric symmetry in those boundaries allows the optimal strategy to be implemented by a simple neural circuit involving normalization with fixed decision bounds and an urgency signal. The model captures several key features of the response of decision-making neurons, as well as the increase in reaction time with the number of alternatives, known as Hick’s law. In addition, we show that in the presence of divisive normalization and internal variability, our model can account for several so-called ‘irrational’ behaviors, such as the similarity effect and violations of both the independence of irrelevant alternatives principle and the regularity principle.

Summary: Everyday decisions require choosing among multiple options. This work derives the optimal decision policy, shows how it can be approximated by a biologically plausible neural circuit, and shows how this circuit can reproduce observed behavior.
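The circuit ingredients named in the abstract (evidence accumulation, divisive normalization, a fixed decision bound, and an urgency signal) can be illustrated with a toy simulation. The sketch below is not the paper's derived model: the function name `simulate_trial`, the rectified accumulators, the linear urgency gain, the semi-saturation constant `sigma`, and every parameter value are assumptions chosen only to make the idea concrete.

```python
import random


def simulate_trial(drifts, threshold=1.0, urgency_slope=1.0, sigma=0.1,
                   noise_sd=1.0, dt=0.01, max_time=20.0, rng=None):
    """Simulate one choice among len(drifts) alternatives (toy model).

    Returns (index of chosen alternative, decision time).
    All names and parameter values are illustrative assumptions.
    """
    rng = rng if rng is not None else random.Random()
    n = len(drifts)
    x = [0.0] * n          # one accumulator per alternative
    best = 0
    steps = int(round(max_time / dt))
    for step in range(1, steps + 1):
        t = step * dt
        # Accumulate noisy momentary evidence; rectify so activity stays >= 0.
        for i in range(n):
            noise = noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
            x[i] = max(0.0, x[i] + drifts[i] * dt + noise)
        # Divisive normalization: each unit is divided by the pooled activity.
        total = sigma + sum(x)
        # Urgency: a gain that grows linearly with elapsed time (assumption).
        urgency = 1.0 + urgency_slope * t
        activity = [urgency * xi / total for xi in x]
        # A choice is triggered when any unit reaches the fixed bound.
        best = max(range(n), key=lambda i: activity[i])
        if activity[best] >= threshold:
            return best, t
    # No bound crossing before max_time: report the current leader.
    return best, max_time


if __name__ == "__main__":
    rng = random.Random(1)
    for n_alt in (2, 4, 8):
        drifts = [1.0] + [0.5] * (n_alt - 1)   # alternative 0 is the best
        trials = [simulate_trial(drifts, rng=rng) for _ in range(200)]
        mean_rt = sum(rt for _, rt in trials) / len(trials)
        accuracy = sum(1 for c, _ in trials if c == 0) / len(trials)
        print(n_alt, round(mean_rt, 3), round(accuracy, 3))
```

Because normalization caps each unit's share of the pooled activity, the fixed bound can only be reached once the urgency gain has grown enough, so the bound on the raw accumulators is effectively time-dependent. Running the script prints mean decision time and accuracy for 2, 4, and 8 alternatives, which gives a rough way to explore how this toy behaves as the number of options grows; it makes no claim to reproduce the paper's quantitative results.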
