Naïve and Robust: Class-Conditional Independence in Human Classification Learning

Humans excel at categorization. Yet learning a novel probabilistic classification task poses severe computational challenges. The present paper investigates one way to address these challenges: assuming class-conditional independence of features. This feature independence assumption simplifies the inference problem, allows for informed inferences about novel feature combinations, and performs robustly across different statistical environments. We designed a new Bayesian classification learning model (the dependence-independence structure and category learning model, DISC-LM) that incorporates varying degrees of prior belief in class-conditional independence, learns whether or not independence holds, and adapts its behavior accordingly. Theoretical results from two simulation studies demonstrate that classification behavior can appear to start simple, yet adapt effectively to unexpected task structures. Two experiments, designed using optimal experimental design principles, were conducted with human learners. Classification decisions of the majority of participants were best accounted for by a version of the model with very high initial prior belief in class-conditional independence, before adapting to the true environmental structure. Class-conditional independence may be a strong and useful default assumption in category learning tasks.
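To make the independence assumption concrete, the following is a minimal sketch of a plain naive Bayes classifier over binary features. It is not the DISC-LM model described above; it only illustrates how treating features as independent given the class lets a learner judge a feature combination it has never observed. The function names, the toy data, and the Laplace smoothing constant `alpha` are assumptions made for illustration.

```python
from collections import Counter


def train_naive_bayes(examples):
    """Count classes and per-class feature occurrences.

    examples: list of (features, label), where features is a tuple of 0/1 values.
    """
    class_counts = Counter(label for _, label in examples)
    n_features = len(examples[0][0])
    # feature_counts[label][i] = number of items of that class with feature i == 1
    feature_counts = {label: [0] * n_features for label in class_counts}
    for features, label in examples:
        for i, value in enumerate(features):
            feature_counts[label][i] += value
    return class_counts, feature_counts


def posterior(model, features, alpha=1.0):
    """P(class | features), assuming features are independent given the class."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for i, value in enumerate(features):
            # Laplace-smoothed estimate of P(feature i = 1 | class)
            p_on = (feature_counts[label][i] + alpha) / (n + 2 * alpha)
            score *= p_on if value == 1 else 1.0 - p_on
        scores[label] = score
    normalizer = sum(scores.values())
    return {label: s / normalizer for label, s in scores.items()}


# Hypothetical toy data: three binary features, two classes.
training = [
    ((1, 1, 0), "A"), ((1, 0, 1), "A"), ((1, 1, 1), "A"),
    ((0, 0, 0), "B"), ((0, 1, 0), "B"), ((0, 0, 1), "B"),
]
model = train_naive_bayes(training)

# (1, 0, 0) never occurs in the training data, yet the per-feature
# estimates still combine into a graded prediction.
novel_item = (1, 0, 0)
post = posterior(model, novel_item)
print(post)
```

A learner that stored only whole feature configurations would have no direct evidence about the unseen item, whereas the factorized likelihood needs just one parameter per feature and class. The trade-off is that the prediction can be wrong when features do interact within a class, which is the case the abstract says the model must learn to detect.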
