Paper information - The pigeon as particle filter

The pigeon as particle filter

Although theorists have interpreted classical conditioning as a laboratory model of Bayesian belief updating, a recent reanalysis showed that the key features that theoretical models capture about learning are artifacts of averaging over subjects. Rather than learning smoothly to asymptote (reflecting, according to Bayesian models, the gradual tradeoff from prior to posterior as data accumulate), subjects learn suddenly and their predictions fluctuate perpetually. We suggest that abrupt and unstable learning can be modeled by assuming subjects are conducting inference using sequential Monte Carlo sampling with a small number of samples — one, in our simulations. Ensemble behavior resembles exact Bayesian models since, as in particle filters, it averages over many samples. Further, the model is capable of exhibiting sophisticated behaviors like retrospective revaluation at the ensemble level, even given minimally sophisticated individuals that do not track uncertainty in their beliefs over trials.
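The contrast the abstract draws between abrupt individual learning and a smooth ensemble curve can be illustrated with a toy simulation. The sketch below is not the authors' model. It assumes a binary hypothesis `w` (the cue either predicts reward or it does not), a fixed per-trial change probability `hazard`, and a reward likelihood `p_hit`; all of these names and values are hypothetical. Each simulated subject carries a single sample and, on every trial, redraws it from the posterior conditional on its previous sample and the current observation, which is one possible way to let a one-sample filter respond to data.

```python
import random


def simulate_subject(n_trials, hazard=0.05, p_hit=0.9, rng=None):
    """One subject holds one sample: a binary belief w that the cue predicts reward.

    hazard and p_hit are illustrative assumptions, not values from the paper.
    """
    rng = rng or random.Random()
    w = 0  # start out believing the cue predicts nothing
    beliefs = []
    for _ in range(n_trials):
        reward = 1  # acquisition phase: every trial is reinforced
        # Unnormalised posterior over the new belief, given the old belief
        # and the current observation: transition prior times likelihood.
        scores = []
        for cand in (0, 1):
            prior = 1 - hazard if cand == w else hazard
            p_reward = p_hit if cand == 1 else 1 - p_hit
            like = p_reward if reward == 1 else 1 - p_reward
            scores.append(prior * like)
        # Draw the single new sample from that conditional posterior.
        w = 1 if rng.random() < scores[1] / (scores[0] + scores[1]) else 0
        beliefs.append(w)
    return beliefs


def ensemble_curve(n_subjects, n_trials, seed=0):
    """Average the single-sample beliefs of many subjects, trial by trial."""
    rng = random.Random(seed)
    runs = [simulate_subject(n_trials, rng=rng) for _ in range(n_subjects)]
    return [sum(run[t] for run in runs) / n_subjects for t in range(n_trials)]


if __name__ == "__main__":
    one = simulate_subject(30, rng=random.Random(1))
    print("single subject:", one)
    print("ensemble mean: ", [round(x, 2) for x in ensemble_curve(500, 30)])
```

Under these assumptions, each individual trace is a step function that jumps between 0 and 1, while the mean across subjects rises gradually, because different subjects make the jump on different trials.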

Aaron C. Courville | N. Daw
