A theory of learning to infer

Bayesian theories of cognition assume that people can integrate probabilities rationally. However, several empirical findings contradict this proposition: human probabilistic inferences are prone to systematic deviations from optimality. Puzzlingly, these deviations sometimes go in opposite directions. Whereas some studies suggest that people under-react to prior probabilities (base rate neglect), other studies find that people under-react to the likelihood of the data (conservatism). We argue that these deviations arise because the human brain does not rely solely on a general-purpose mechanism for approximating Bayesian inference that is invariant across queries. Instead, the brain is equipped with a recognition model that maps queries to probability distributions. The parameters of this recognition model are optimized to get the output as close as possible, on average, to the true posterior. Because of our limited computational resources, the recognition model will allocate its resources so as to be more accurate for high probability queries than for low probability queries. By adapting to the query distribution, the recognition model “learns to infer.” We show that this theory can explain why and when people under-react to the data or the prior, and a new experiment demonstrates that these two forms of under-reaction can be systematically controlled by manipulating the query distribution. The theory also explains a range of related phenomena: memory effects, belief bias, and the structure of response variability in probabilistic reasoning. We also discuss how the theory can be integrated with prior sampling-based accounts of approximate inference.
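The central idea, a recognition model whose parameters are tuned to match the true posterior on average over the queries it encounters, can be illustrated with a deliberately small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: queries are pairs of prior log-odds and log-likelihood ratios drawn from zero-mean Gaussians, the recognition model is linear in those two quantities, squared error in log-odds stands in for the divergence from the true posterior, and a ridge penalty stands in for limited computational resources. The function names `sample_queries` and `fit_recognition_model` are hypothetical.

```python
import random


def sample_queries(n, prior_sd, lik_sd, seed=0):
    """Draw n hypothetical queries as (prior log-odds, log-likelihood ratio) pairs."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, prior_sd), rng.gauss(0.0, lik_sd)) for _ in range(n)]


def fit_recognition_model(queries, penalty=1.0):
    """Return (w_prior, w_lik) minimising, over the query sample,

        mean((prior + lik - w_prior * prior - w_lik * lik) ** 2)
        + penalty * (w_prior ** 2 + w_lik ** 2)

    Exact Bayesian updating in log-odds corresponds to w_prior = w_lik = 1.
    The penalty is an assumed stand-in for limited resources.
    """
    n = len(queries)
    # Second moments of the query distribution.
    s_pp = sum(p * p for p, _ in queries) / n
    s_ll = sum(l * l for _, l in queries) / n
    s_pl = sum(p * l for p, l in queries) / n
    # Normal equations: (S + penalty * I) w = S @ [1, 1], solved by Cramer's rule.
    a11, a12, a22 = s_pp + penalty, s_pl, s_ll + penalty
    b1, b2 = s_pp + s_pl, s_pl + s_ll
    det = a11 * a22 - a12 * a12
    w_prior = (b1 * a22 - a12 * b2) / det
    w_lik = (a11 * b2 - a12 * b1) / det
    return w_prior, w_lik


if __name__ == "__main__":
    # Query distribution in which priors vary a lot and likelihoods vary little.
    informative_prior = sample_queries(5000, prior_sd=2.0, lik_sd=0.5)
    # Query distribution with the opposite pattern.
    informative_lik = sample_queries(5000, prior_sd=0.5, lik_sd=2.0)
    print("informative prior:     ", fit_recognition_model(informative_prior))
    print("informative likelihood:", fit_recognition_model(informative_lik))
```

In this toy setting, both weights fall below 1 whenever the penalty is positive, and the weight on the less variable source shrinks more. Under these assumptions, a weight below 1 on the likelihood term plays the role of conservatism and a weight below 1 on the prior term plays the role of base rate neglect, so shifting the query distribution shifts which under-reaction dominates. Setting `penalty=0.0` recovers the exact Bayesian weights.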
