Bayesian inference with Stan: A tutorial on adding custom distributions

When evaluating cognitive models based on fits to observed data (or, really, any model that has free parameters), parameter estimation is critically important. Traditional techniques, such as hill climbing to minimize or maximize a fit statistic, typically yield only point estimates. Bayesian approaches instead estimate parameters as posterior probability distributions and thus naturally account for the uncertainty associated with parameter estimation; they also offer powerful and principled methods for model comparison. Although software applications such as WinBUGS (Lunn, Thomas, Best, & Spiegelhalter, 2000) and JAGS (Plummer, 2003) provide “turnkey”-style packages for Bayesian inference, they can be inefficient when dealing with models whose parameters are correlated, which is often the case for cognitive models, and they can impose significant technical barriers to adding custom distributions, which is often necessary when implementing cognitive models within a Bayesian framework. A recently developed software package called Stan (Stan Development Team, 2015) can solve both problems while still providing a turnkey solution to Bayesian inference. We present a tutorial on how to use Stan and how to add custom distributions to it, with an example using the linear ballistic accumulator model (Brown & Heathcote, 2008).

[1]  Andrew Gelman, et al. The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo, 2011, J. Mach. Learn. Res.

[2]  S. Duane, et al. Hybrid Monte Carlo, 1987.

[3]  Thomas J. Palmeri, et al. Neurocognitive Modeling of Perceptual Decision Making, 2015.

[4]  Andrew Gelman, et al. Handbook of Markov Chain Monte Carlo, 2011.

[5]  D. Rubin, et al. Inference from Iterative Simulation Using Multiple Sequences, 1992.

[6]  S. Wolf, et al. The quality of evidence, 1991, Integrative Physiological and Behavioral Science: the official journal of the Pavlovian Society.

[7]  Joachim Vandekerckhove, et al. Oxford Handbook of Computational and Mathematical Psychology, 2014.

[8]  Andrew Thomas, et al. WinBUGS - A Bayesian modelling framework: Concepts, structure, and extensibility, 2000, Stat. Comput.

[9]  M. Lee, et al. Bayesian Cognitive Modeling: A Practical Course, 2014.

[10]  Paul Sajda, et al. Quality of evidence for perceptual decision making is indexed by trial-to-trial variability of the EEG, 2009, Proceedings of the National Academy of Sciences.

[11]  Martyn Plummer, et al. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling, 2003.

[12]  Dustin Tran, et al. Automatic Differentiation Variational Inference, 2016, J. Mach. Learn. Res.

[13]  Roger Ratcliff, et al. Individual differences, aging, and IQ in two-choice tasks, 2010, Cognitive Psychology.

[14]  David B. Dunson, et al. Bayesian Data Analysis, 2010.

[15]  Scott D. Brown, et al. Neural Correlates of Trial-to-Trial Fluctuations in Response Caution, 2011, The Journal of Neuroscience.

[16]  Brandon M. Turner, et al. Informing cognitive abstractions through neuroimaging: the neural drift diffusion model, 2015, Psychological Review.

[17]  W. K. Hastings, et al. Monte Carlo Sampling Methods Using Markov Chains and Their Applications, 1970.

[18]  Radford M. Neal. MCMC Using Hamiltonian Dynamics, 2011, arXiv:1206.1901.

[19]  Simon Farrell, et al. Computational Modeling in Cognition: Principles and Practice, 2010.

[20]  Christian P. Robert, et al. Monte Carlo Statistical Methods, 2005, Springer Texts in Statistics.

[21]  Sylvia Richardson, et al. Markov Chain Monte Carlo in Practice, 1997.

[22]  Joachim Vandekerckhove, et al. Extending JAGS: A tutorial on adding custom distributions to JAGS (with a diffusion model example), 2013, Behavior Research Methods.

[23]  William A. Link, et al. On thinning of chains in MCMC, 2012.

[24]  P. Hewson. Review of Bayesian Data Analysis, 3rd edn, by A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari and D. B. Rubin (Boca Raton: Chapman and Hall–CRC, 2013; 676 pp., £44.99, ISBN 1‐439‐84095‐4), 2015.

[25]  Braden A. Purcell, et al. From Salience to Saccades: Multiple-Alternative Gated Stochastic Accumulator Model of Visual Search, 2012, The Journal of Neuroscience.

[26]  N. Metropolis, et al. Equation of State Calculations by Fast Computing Machines, 1953, Resonance.

[27]  Scott D. Brown, et al. The simplest complete model of choice response time: Linear ballistic accumulation, 2008, Cognitive Psychology.

[28]  Roger Ratcliff, et al. The Diffusion Decision Model: Theory and Data for Two-Choice Decision Tasks, 2008, Neural Computation.

[29]  Brandon M. Turner, et al. A method for efficiently sampling from distributions with correlated dimensions, 2013, Psychological Methods.

[30]  J. Kruschke. Doing Bayesian Data Analysis: A Tutorial with R and BUGS, 2010.

[31]  Matthew P. Wand, et al. Fully simplified multivariate normal updates in non-conjugate variational message passing, 2014, J. Mach. Learn. Res.

[32]  Scott D. Brown, et al. The overconstraint of response time models: Rethinking the scaling problem, 2009, Psychonomic Bulletin & Review.

[33]  Brandon M. Turner, et al. Approximate Bayesian computation with differential evolution, 2012.