Trial-by-trial data analysis using computational models

In numerous high-profile studies, researchers have recently begun to integrate computational models into the analysis of data from experiments on reward learning and decision making (Platt and Glimcher, 1999; O’Doherty et al., 2003; Sugrue et al., 2004; Barraclough et al., 2004; Samejima et al., 2005; Daw et al., 2006; Li et al., 2006; Frank et al., 2007; Tom et al., 2007; Kable and Glimcher, 2007; Lohrenz et al., 2007; Schonberg et al., 2007; Wittmann et al., 2008; Hare et al., 2008; Hampton et al., 2008; Plassmann et al., 2008). Because these techniques are spreading rapidly but have been developed and documented somewhat sporadically alongside the studies themselves, the present review aims to clarify the toolbox (see also O’Doherty et al., 2007). In particular, we discuss the rationale for these methods and the questions they are suited to address. We then offer a relatively practical tutorial on the basic statistical methods for answering those questions and on how they can be applied to data analysis. The techniques are illustrated with fits of simple models to simulated datasets. Throughout, we flag interpretational and technical pitfalls of which we believe authors, reviewers, and readers should be aware. We focus on cataloging the particular, admittedly somewhat idiosyncratic, combination of techniques frequently used in this literature, but also on exposing these techniques as instances of a general set of tools that can be applied to analyze behavioral and neural data of many sorts. A number of other reviews (Daw and Doya, 2006; Dayan and Niv, 2008) have focused on the scientific conclusions that have been obtained with these methods, an issue we omit almost entirely here. There are also excellent books that cover statistical inference of this kind with much greater generality, formal precision, and detail (MacKay, 2003; Gelman et al., 2004; Bishop, 2006; Gelman and Hill, 2007).

[1]  Peter Bossaerts,et al.  Neural correlates of mentalizing-related computations during strategic interactions in humans , 2008, Proceedings of the National Academy of Sciences.

[2]  Ben J. A. Kröse,et al.  Learning from delayed rewards , 1995, Robotics Auton. Syst..

[3]  N. Daw,et al.  Reinforcement Learning Signals in the Human Striatum Distinguish Learners from Nonlearners during Reward-Based Decision Making , 2007, The Journal of Neuroscience.

[4]  P. Dayan,et al.  Cortical substrates for exploratory decisions in humans , 2006, Nature.

[5]  H. Akaike A new look at the statistical model identification , 1974 .

[6]  P. Dayan,et al.  Differential Encoding of Losses and Gains in the Human Striatum , 2007, The Journal of Neuroscience.

[7]  K. Doya,et al.  The computational neurobiology of learning and reward , 2006, Current Opinion in Neurobiology.

[8]  P. Glimcher,et al.  Dynamic Response-by-Response Models of Matching Behavior in Rhesus Monkeys , 2005, Journal of the Experimental Analysis of Behavior.

[9]  Joseph Hilbe,et al.  Data Analysis Using Regression and Multilevel/Hierarchical Models , 2009 .

[10]  Colin Camerer,et al.  Dissociating the Role of the Orbitofrontal Cortex and the Striatum in the Computation of Goal Values and Prediction Errors , 2008, The Journal of Neuroscience.

[11]  R. Dolan,et al.  Subliminal Instrumental Conditioning Demonstrated in the Human Brain , 2008, Neuron.

[12]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[13]  Brian Knutson,et al.  FMRI Visualization of Brain Activity during a Monetary Incentive Delay Task , 2000, NeuroImage.

[14]  P. Glimcher,et al.  Midbrain Dopamine Neurons Encode a Quantitative Reward Prediction Error Signal , 2005, Neuron.

[15]  Karl J. Friston,et al.  Temporal Difference Models and Reward-Related Learning in the Human Brain , 2003, Neuron.

[16]  A. Barto Adaptive Critics and the Basal Ganglia , 1995 .

[17]  P. Glimcher,et al.  The neural correlates of subjective value during intertemporal choice , 2007, Nature Neuroscience.

[18]  Karl J. Friston,et al.  Bayesian model selection for group studies , 2009, NeuroImage.

[19]  S. Kakade,et al.  Acquisition and extinction in autoshaping. , 2002, Psychological review.

[20]  Kenji Doya,et al.  Estimating Internal Variables and Parameters of a Learning Agent by a Particle Filter , 2003, NIPS.

[21]  R. Hertwig,et al.  The priority heuristic: making choices without trade-offs. , 2006, Psychological review.

[22]  J. O'Doherty,et al.  Model‐Based fMRI and Its Application to Reward Learning and Decision Making , 2007, Annals of the New York Academy of Sciences.

[23]  Samuel M. McClure,et al.  Temporal Prediction Errors in a Passive Learning Task Activate Human Striatum , 2003, Neuron.

[24]  Karl J. Friston,et al.  Mixed-effects and fMRI studies , 2005, NeuroImage.

[25]  P. Dayan,et al.  Reinforcement learning: The Good, The Bad and The Ugly , 2008, Current Opinion in Neurobiology.

[26]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[27]  D. McFadden Conditional logit analysis of qualitative choice behavior , 1972 .

[28]  Karl J. Friston,et al.  Generalisability, Random Effects & Population Inference , 1998, NeuroImage.

[29]  Sabrina M. Tom,et al.  The Neural Basis of Loss Aversion in Decision-Making Under Risk , 2007, Science.

[30]  M. Hallett Human Brain Function , 1998, Trends in Neurosciences.

[31]  D. Barraclough,et al.  Prefrontal cortex and decision making in a mixed-strategy game , 2004, Nature Neuroscience.

[32]  David J. C. MacKay,et al.  Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.

[33]  Peter Dayan,et al.  A Neural Substrate of Prediction and Reward , 1997, Science.

[34]  K. Doya,et al.  Representation of Action-Specific Reward Values in the Striatum , 2005, Science.

[35]  Samuel M. McClure,et al.  Policy Adjustment in a Dynamic Economic Game , 2006, PloS one.

[36]  N. Daw,et al.  Striatal Activity Underlies Novelty-Based Choice in Humans , 2008, Neuron.

[37]  John N. Tsitsiklis,et al.  Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.

[38]  D. Freedman,et al.  On The So-Called “Huber Sandwich Estimator” and “Robust Standard Errors” , 2006 .

[39]  L. Nystrom,et al.  Tracking the hemodynamic responses to reward and punishment in the striatum. , 2000, Journal of neurophysiology.

[40]  Teck-Hua Ho,et al.  Experience-Weighted Attraction Learning in Coordination Games: Probability Rules, Heterogeneity, and Time-Variation. , 1998, Journal of mathematical psychology.

[41]  Michael L. Platt,et al.  Neural correlates of decision variables in parietal cortex , 1999, Nature.

[42]  R. Turner,et al.  Event-Related fMRI: Characterizing Differential Responses , 1998, NeuroImage.

[43]  J. O'Doherty,et al.  Marketing actions can modulate neural representations of experienced pleasantness , 2008, Proceedings of the National Academy of Sciences.

[44]  Sean M. Polyn,et al.  Beyond mind-reading: multi-voxel pattern analysis of fMRI data , 2006, Trends in Cognitive Sciences.

[45]  Michael J. Frank,et al.  Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning , 2007, Proceedings of the National Academy of Sciences.

[46]  Teck-Hua Ho,et al.  Experience-Weighted Attraction Learning in Games: A Unifying Approach , 1997 .

[47]  Peter Dayan,et al.  Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems , 2001 .

[48]  Karl J. Friston,et al.  Posterior probability maps and SPMs , 2003, NeuroImage.

[49]  W. Newsome,et al.  Matching Behavior and the Representation of Value in the Parietal Cortex , 2004, Science.

[50]  P. Dayan,et al.  Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control , 2005, Nature Neuroscience.

[51]  L. Wasserman,et al.  Computing Bayes Factors by Combining Simulation and Asymptotic Approximations , 1997 .

[52]  Michael I. Jordan,et al.  PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.

[53]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[54]  Kevin McCabe,et al.  Neural signature of fictive learning signals in a sequential investment task , 2007, Proceedings of the National Academy of Sciences.

[55]  Timothy E. J. Behrens,et al.  Learning the value of information in an uncertain world , 2007, Nature Neuroscience.

[56]  John K Kruschke,et al.  Bayesian data analysis. , 2010, Wiley interdisciplinary reviews. Cognitive science.

[57]  C. Bhat Quasi-random maximum simulated likelihood estimation of the mixed multinomial logit model , 2001 .

[58]  P. J. Huber The behavior of maximum likelihood estimates under nonstandard conditions , 1967 .

[59]  Michael J. Frank,et al.  By Carrot or by Stick: Cognitive Reinforcement Learning in Parkinsonism , 2004, Science.