The Assumption of Class-Conditional Independence in Category Learning

Jana Jarecki (jarecki@mpib-berlin.mpg.de), Björn Meder (meder@mpib-berlin.mpg.de), Jonathan D. Nelson (nelson@mpib-berlin.mpg.de)
Center for Adaptive Cognition and Behavior (ABC), Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin, Germany

Abstract

This paper investigates the role of the assumption of class-conditional independence of object features in human classification learning. This assumption holds that object feature values are statistically independent of each other, given knowledge of the object's true category. Treating features as class-conditionally independent can in many situations substantially facilitate learning and categorization even if the assumption is not perfectly true. Using optimal experimental design principles, we designed a task to test whether people have this default assumption when learning to categorize. Results provide some supporting evidence, although the data are mixed. What is clear is that classification behavior adapts to the structure of the environment: a category structure that is unlearnable under the assumption of class-conditional independence is learned by all participants.

Keywords: Multiple-cue classification learning; class-conditional independence; naïve Bayes; causal Markov condition

Introduction

Categorization is fundamental for cognition. Grouping together objects or events helps us to efficiently encode environmental patterns, make inferences about unobserved properties of novel instances, and make decisions. Without categorization we could not see the woods for the trees.

Despite the ease with which we form categories and use them to make inferences or judgments, from a computational perspective categorization is a challenging problem. For instance, different diseases can cause similar symptoms, entailing that diagnostic inferences are often only probabilistic. Patients may have new symptom combinations and still require a diagnosis. Depending on the specific assumptions a physician makes about the relationship between diseases and symptoms, that physician could justifiably make very different inferences about the diseases.

In the present paper, we investigate the role of the possible assumption of class-conditional independence of features in category learning. Class-conditional independence holds if the features of the category members are statistically independent given the true class. This assumption can facilitate classification and learning of category structures. The concept of class-conditional independence underlies the naïve Bayes classifier in machine learning (Domingos & Pazzani, 1997), and is also a key assumption in some psychological classification models (e.g., Fried & Holyoak, 1984; Anderson, 1991). It is related to ideas of channel separability in sensory perception (Movellan & McClelland, 2001). Similar ideas are found in Reichenbach's (1956) common-cause principle in the philosophy of science and in causal modeling (Spirtes, Glymour, & Scheines, 1993; Pearl, 2000).

Both the philosophical and psychological literature make claims about the normative bases of the assumption of class-conditional independence of features. Our focus here is not on the general normativity or nonnormativity of that assumption, but on whether the assumption of class-conditional independence may (perhaps tacitly) underlie people's inferences in learning and multiple-cue categorization tasks. We think of this assumption as one of many possible default (heuristic or meta-heuristic) assumptions that, if close enough to an environment's actual structure, may facilitate learning and inferences.

The Psychology of Conditional Independence

Some psychological models of categorization incorporate assumptions of class-conditional independence, such as the category density model (Fried & Holyoak, 1984) or Anderson's (1991) rational model of categorization. Both models treat features of instances as class-conditionally independent to make inferences about category membership or unobserved item properties.

Other research has focused more directly on the role of conditional independence assumptions in human reasoning. For instance, a key assumption in many formal causal modeling approaches (e.g., Pearl, 2000; Spirtes et al., 1993) is the so-called causal Markov condition, which assumes that a variable in a causal network is independent of all other variables (except for its causal descendants), conditional on its direct causes. As this assumption facilitates probabilistic inferences across complex causal networks, it has been suggested that people's causal inferences could also comply with this conditional independence assumption.

Von Sydow, Meder, and Hagmayer (2009) investigated reasoning about causal chains and found that subjects' inferences indicated a use of conditional independence assumptions, even if the learning data suggested otherwise.1 Other research, however, found violations of the causal Markov condition (Rehder & Burnett, 2005). Asked to infer the prob-

1 For instance, applying the causal Markov condition to a causal chain X → Y → Z entails that Z is independent of X given Y (e.g., P(z|y, x) = P(z|y, ¬x)).
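To make the class-conditional independence assumption concrete, the naïve Bayes computation it licenses can be sketched in a few lines of Python: the posterior for each class is the class prior multiplied by the per-feature likelihoods, as if features carried independent evidence given the class. This sketch is illustrative only; the function name, the two "disease" classes, and all probabilities are hypothetical and not taken from the paper.

```python
def naive_bayes_posterior(features, priors, likelihoods):
    """Return P(class | features) under class-conditional independence.

    priors:      dict mapping class -> P(class)
    likelihoods: dict mapping class -> {feature index -> {value -> P(value | class)}}
    """
    scores = {}
    for c, prior in priors.items():
        score = prior
        for i, value in enumerate(features):
            # The independence assumption: likelihoods simply multiply.
            score *= likelihoods[c][i][value]
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}  # normalize to a posterior


# Hypothetical example: two diseases, two binary symptoms (all numbers made up).
priors = {"A": 0.5, "B": 0.5}
likelihoods = {
    "A": {0: {1: 0.8, 0: 0.2}, 1: {1: 0.3, 0: 0.7}},
    "B": {0: {1: 0.2, 0: 0.8}, 1: {1: 0.7, 0: 0.3}},
}
posterior = naive_bayes_posterior((1, 0), priors, likelihoods)
```

Note that the patient's symptom pattern (1, 0) need not have been observed during learning: the factorization lets the classifier generalize to novel feature combinations, which is one reason the assumption can facilitate learning even when it is only approximately true.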
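The causal Markov condition in footnote 1 can likewise be verified numerically. The sketch below, with made-up conditional probabilities, builds the joint distribution of a chain X → Y → Z from the factorization P(x)·P(y|x)·P(z|y) and checks that P(z|y, x) = P(z|y, ¬x); the variable names follow the footnote, everything else is a hypothetical illustration.

```python
from itertools import product

# Hypothetical parameters for the chain X -> Y -> Z (all numbers made up).
p_x = {1: 0.4, 0: 0.6}
p_y_given_x = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.2, 0: 0.8}}  # keyed by x, then y
p_z_given_y = {1: {1: 0.7, 0: 0.3}, 0: {1: 0.1, 0: 0.9}}  # keyed by y, then z

# Joint distribution over (x, y, z), factorized along the chain.
joint = {(x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
         for x, y, z in product((0, 1), repeat=3)}

def p_z_given_yx(z, y, x):
    """Compute P(Z=z | Y=y, X=x) directly from the joint distribution."""
    numerator = joint[(x, y, z)]
    denominator = sum(joint[(x, y, zz)] for zz in (0, 1))
    return numerator / denominator

# Conditional on Y, the value of X is irrelevant for Z:
lhs = p_z_given_yx(1, 1, 1)   # P(z | y, x)
rhs = p_z_given_yx(1, 1, 0)   # P(z | y, not-x)
```

Here lhs and rhs coincide exactly, which is the footnote's claim: once the direct cause Y is known, the indirect cause X carries no further information about Z.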
References

[1] Myung, J. I., et al. Optimal experimental design for model discrimination. Psychological Review, 2009.
[2] Smith, J. D., et al. Prototypes in the mist: The early epochs of category learning. 1998.
[3] Nelson, J. D., et al. Information search with situation-specific reward functions. Judgment and Decision Making, 2012.
[4] Anderson, J. R., et al. The adaptive nature of human categorization. 1991.
[5] Nelson, J. D. Finding useful questions: On Bayesian diagnosticity, probability, impact, and information gain. Psychological Review, 2005.
[6] Hoffman, A. B., et al. Eyetracking and selective attention in category learning. Cognitive Psychology, 2005.
[7] Holyoak, K., et al. Induction of category distributions: A framework for classification learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 1984.
[8] Stringham, N. J. Experience matters. RN, 2009.
[9] McClelland, J. L., et al. The Morton-Massaro law of information integration: Implications for models of perception. Psychological Review, 2001.
[10] Spirtes, P., et al. Causation, prediction, and search. 1993.
[11] Medin, D., et al. Linear separability in classification learning. 1981.
[12] Medin, D. L., et al. Context theory of classification learning. 1978.
[13] Hagmayer, Y., et al. A transitivity heuristic of probabilistic causal reasoning. 2009.
[14] Domingos, P. M., et al. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 1997.
[15] Elton, L., et al. The direction of time. 1978.
[16] Burnett, R. C., et al. Feature inference and the causal structure of categories. Cognitive Psychology, 2005.
[17] Pearl, J. Causality: Models, reasoning and inference. 2000.
[18] Gluck, M., et al. Probabilistic classification learning in amnesia. Learning & Memory, 1994.
[19] Kosslyn, S. M., et al. Memory and mind: A festschrift for Gordon H. Bower. 2007.
[20] Hertwig, R., et al. Decisions from experience and the effect of rare events in risky choice. Psychological Science, 2004.
[21] Mayrhofer, R., et al. Agents and causes: A Bayesian error attribution model of causal reasoning. 2010.