The Assumption of Class-Conditional Independence in Category Learning

Jana Jarecki (jarecki@mpib-berlin.mpg.de), Björn Meder (meder@mpib-berlin.mpg.de), Jonathan D. Nelson (nelson@mpib-berlin.mpg.de)
Center for Adaptive Cognition and Behavior (ABC), Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin, Germany

Abstract

This paper investigates the role of the assumption of class-conditional independence of object features in human classification learning. This assumption holds that object feature values are statistically independent of each other, given knowledge of the object's true category. Treating features as class-conditionally independent can in many situations substantially facilitate learning and categorization, even if the assumption is not perfectly true. Using optimal experimental design principles, we designed a task to test whether people have this default assumption when learning to categorize. Results provide some supporting evidence, although the data are mixed. What is clear is that classification behavior adapts to the structure of the environment: a category structure that is unlearnable under the assumption of class-conditional independence is learned by all participants.

Keywords: multiple-cue classification learning; class-conditional independence; naïve Bayes; causal Markov condition

Introduction

Categorization is fundamental for cognition. Grouping together objects or events helps us to efficiently encode environmental patterns, make inferences about unobserved properties of novel instances, and make decisions. Without categorization we could not see the woods for the trees.

Despite the ease with which we form categories and use them to make inferences or judgments, categorization is a challenging problem from a computational perspective. For instance, different diseases can cause similar symptoms, entailing that diagnostic inferences are often only probabilistic. Patients may present with new symptom combinations and still require a diagnosis. Depending on the specific assumptions a physician makes about the relationship between diseases and symptoms, he or she could justifiably draw very different inferences about the diseases.

In the present paper, we investigate the role of a possible assumption of class-conditional independence of features in category learning. Class-conditional independence holds if the features of category members are statistically independent given the true class. This assumption can facilitate classification and the learning of category structures. The concept of class-conditional independence underlies the naïve Bayes classifier in machine learning (Domingos & Pazzani, 1997) and is also a key assumption in some psychological classification models (e.g., Fried & Holyoak, 1984; Anderson, 1991). It is related to ideas of channel separability in sensory perception (Movellan & McClelland, 2001). Similar ideas are found in Reichenbach's (1956) common-cause principle in the philosophy of science and in causal modeling (Spirtes, Glymour, & Scheines, 1993; Pearl, 2000).

Both the philosophical and the psychological literature make claims about the normative basis of the assumption of class-conditional independence of features. Our focus here is not on the general normativity or nonnormativity of that assumption, but on whether it may (perhaps tacitly) underlie people's inferences in learning and multiple-cue categorization tasks. We think of this assumption as one of many possible default (heuristic or meta-heuristic) assumptions that, if close enough to an environment's actual structure, may facilitate learning and inference.
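To make this concrete: under class-conditional independence, the likelihood of a feature pattern factorizes as P(f1, ..., fn | c) = P(f1 | c) x ... x P(fn | c), so a learner needs to estimate only one probability per feature and class rather than one per feature configuration. The following minimal sketch (ours, not the authors'; the training data are hypothetical) shows how a naïve Bayes learner exploits this factorization:

# Minimal naive Bayes classifier for binary features (illustrative sketch;
# the training data below are hypothetical, not from the paper).
from collections import defaultdict

def train(examples):
    """examples: list of (features, label), features a tuple of 0/1 values.
    Returns class priors and per-class feature probabilities,
    estimated with Laplace (add-one) smoothing."""
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for features, label in examples:
        class_counts[label] += 1
        for i, f in enumerate(features):
            feature_counts[label][i] += f
    n_features = len(examples[0][0])
    priors = {c: class_counts[c] / len(examples) for c in class_counts}
    likelihoods = {
        c: {i: (feature_counts[c][i] + 1) / (class_counts[c] + 2)
            for i in range(n_features)}
        for c in class_counts
    }
    return priors, likelihoods

def posterior(features, priors, likelihoods):
    """P(c | features), using class-conditional independence:
    P(f1, ..., fn | c) = prod_i P(fi | c)."""
    scores = {}
    for c, prior in priors.items():
        p = prior
        for i, f in enumerate(features):
            p_i = likelihoods[c][i]
            p *= p_i if f == 1 else 1 - p_i
        scores[c] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Hypothetical training data: (feature1, feature2), class label.
data = [((1, 1), "A"), ((1, 0), "A"), ((0, 1), "B"), ((0, 0), "B")]
priors, likelihoods = train(data)
print(posterior((1, 1), priors, likelihoods))  # favors class "A" (0.75)

With n binary features, the factorization reduces the number of likelihood parameters per class from 2^n - 1 to n, which is one sense in which the assumption can facilitate learning even when it holds only approximately.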
The Psychology of Conditional Independence

Some psychological models of categorization incorporate assumptions of class-conditional independence, such as the category density model (Fried & Holyoak, 1984) or Anderson's (1991) rational model of categorization. Both models treat the features of instances as class-conditionally independent in order to make inferences about category membership or unobserved item properties.

Other research has focused more directly on the role of conditional independence assumptions in human reasoning. For instance, a key assumption in many formal causal modeling approaches (e.g., Pearl, 2000; Spirtes et al., 1993) is the so-called causal Markov condition, which states that a variable in a causal network is independent of all other variables (except its causal descendants), conditional on its direct causes. Because this assumption facilitates probabilistic inference across complex causal networks, it has been suggested that people's causal inferences might also comply with it.

Von Sydow, Meder, and Hagmayer (2009) investigated reasoning about causal chains and found that subjects' inferences indicated the use of conditional independence assumptions, even when the learning data suggested otherwise.¹ Other research, however, has found violations of the causal Markov condition (Rehder & Burnett, 2005). Asked to infer the prob-

¹ For instance, applying the causal Markov condition to a causal chain X → Y → Z entails that Z is independent of X given Y (e.g., P(z | y, x) = P(z | y, ¬x)).
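As a numerical illustration of the footnote's claim (ours; the probability values are made up, and any values would do), the following sketch builds the joint distribution of a chain X → Y → Z from its conditional probability tables and confirms that P(z | y, x) = P(z | y, ¬x):

# Numerical illustration of the causal Markov condition on a chain X -> Y -> Z.
from itertools import product

p_x = 0.3                              # P(x)
p_y_given = {True: 0.8, False: 0.2}    # P(y | x), P(y | not-x)
p_z_given = {True: 0.9, False: 0.1}    # P(z | y), P(z | not-y)

# The joint distribution P(x, y, z) factorizes along the chain:
# P(x, y, z) = P(x) * P(y | x) * P(z | y).
joint = {}
for x, y, z in product([True, False], repeat=3):
    px = p_x if x else 1 - p_x
    py = p_y_given[x] if y else 1 - p_y_given[x]
    pz = p_z_given[y] if z else 1 - p_z_given[y]
    joint[(x, y, z)] = px * py * pz

def cond_prob_z(y, x):
    """P(z | y, x), computed from the joint distribution."""
    num = joint[(x, y, True)]
    den = joint[(x, y, True)] + joint[(x, y, False)]
    return num / den

# The Markov condition entails P(z | y, x) == P(z | y, not-x).
print(cond_prob_z(True, True))   # 0.9
print(cond_prob_z(True, False))  # 0.9 -> Z is independent of X given Y

Equality holds by construction: the chain factorization P(x, y, z) = P(x) P(y | x) P(z | y) makes Y screen off X from Z.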

References

Anderson, J. R. (1991). The adaptive nature of human categorization. Psychological Review, 98(3), 409–429.

Domingos, P., & Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29, 103–130.

Fried, L. S., & Holyoak, K. J. (1984). Induction of category distributions: A framework for classification learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10(2), 234–257.

Gluck, M. A., Anderson, J. R., & Kosslyn, S. M. (Eds.). (2007). Memory and mind: A festschrift for Gordon H. Bower. Mahwah, NJ: Erlbaum.

Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions from experience and the effect of rare events in risky choice. Psychological Science, 15(8), 534–539.

Knowlton, B. J., Squire, L. R., & Gluck, M. A. (1994). Probabilistic classification learning in amnesia. Learning & Memory, 1(2), 106–120.

Mayrhofer, R., Hagmayer, Y., & Waldmann, M. R. (2010). Agents and causes: A Bayesian error attribution model of causal reasoning. In Proceedings of the 32nd Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.

Meder, B., & Nelson, J. D. (2012). Information search with situation-specific reward functions. Judgment and Decision Making, 7(2), 119–148.

Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85(3), 207–238.

Medin, D. L., & Schwanenflugel, P. J. (1981). Linear separability in classification learning. Journal of Experimental Psychology: Human Learning and Memory, 7(5), 355–368.

Movellan, J. R., & McClelland, J. L. (2001). The Morton-Massaro law of information integration: Implications for models of perception. Psychological Review, 108(1), 113–148.

Myung, J. I., & Pitt, M. A. (2009). Optimal experimental design for model discrimination. Psychological Review, 116(3), 499–518.

Nelson, J. D. (2005). Finding useful questions: On Bayesian diagnosticity, probability, impact, and information gain. Psychological Review, 112(4), 979–999.

Nelson, J. D., McKenzie, C. R. M., Cottrell, G. W., & Sejnowski, T. J. (2010). Experience matters: Information acquisition optimizes probability gain. Psychological Science, 21(7), 960–969.

Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge, UK: Cambridge University Press.

Rehder, B., & Burnett, R. C. (2005). Feature inference and the causal structure of categories. Cognitive Psychology, 50(3), 264–314.

Rehder, B., & Hoffman, A. B. (2005). Eyetracking and selective attention in category learning. Cognitive Psychology, 51(1), 1–41.

Reichenbach, H. (1956). The direction of time. Berkeley, CA: University of California Press.

Smith, J. D., & Minda, J. P. (1998). Prototypes in the mist: The early epochs of category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24(6), 1411–1436.

Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction, and search. New York, NY: Springer-Verlag.

von Sydow, M., Meder, B., & Hagmayer, Y. (2009). A transitivity heuristic of probabilistic causal reasoning. In Proceedings of the 31st Annual Conference of the Cognitive Science Society (pp. 803–808). Austin, TX: Cognitive Science Society.