Learning Deterministic Causal Networks from Observational Data

Ben Deverett (ben.deverett@mail.mcgill.ca), McGill University
Charles Kemp (ckemp@cmu.edu), Carnegie Mellon University

Abstract

Previous work suggests that humans find it difficult to learn the structure of causal systems given observational data alone. We show that structure learning is successful when the causal systems in question are consistent with people's expectations that causal relationships are deterministic and that each pattern of observations has a single underlying cause. Our data are well explained by a Bayesian model that incorporates a preference for symmetric structures and a preference for structures that make the observed data not only possible but likely.

Keywords: structure learning, causal learning, Bayesian modeling

Causal networks have been widely used as models of the mental representations that support causal reasoning. For example, an engineer's knowledge of the local electricity system may take the form of a network where the nodes represent power stations and the links in the network represent connections between stations. Causal networks of this kind may be learned in several ways. For example, an intervention at station A that also affects station B provides evidence for a directed link between A and B. Networks can also be learned via instruction: for example, a senior colleague might tell the engineer that A sends power to B. Here, however, we focus on whether and how causal networks can be learned from observational data. For example, the engineer might infer that A sends power to B after observing that A and B are both inactive during some blackouts, that B alone is inactive during others, but that A is never the only inactive station.

A consensus has emerged that causal structure learning is difficult or impossible given observational data alone. For example, Fernbach and Sloman (2009) cite the results of Steyvers, Tenenbaum, Wagenmakers, and Blum (2003), Lagnado and Sloman (2004), and White (2006) to support their claim that "observation of covariation is insufficient for most participants to recover causal structure" (p. 680). Here we join Mayrhofer and Waldmann (2011) in challenging this consensus. We show that people succeed in a structure learning task when the causal systems under consideration are aligned with intuitive expectations about causality. Previous studies suggest that people expect causal relationships to be deterministic (Schulz & Sommerville, 2006; Lu, Yuille, Liljeholm, Cheng, & Holyoak, 2008), and expect that any pattern of observations tends to be a consequence of a single underlying cause (Lombrozo, 2007). We ask people to reason about systems that are consistent with both expectations, and find that structure learning is reliably achieved under these conditions.

A previous study by White (2006) asked participants to learn the structure of deterministic causal systems from observational data alone. The structures involved were five-node networks where the nodes represented population levels of five different species. White's task proved to be difficult, and performance was poor even when White gave his participants explicit instructions about how to infer causal structure from observational data. Here, however, we demonstrate that both structures considered by White can be reliably learned in the context of the experimental paradigm that we develop.

Given that humans perform well on the structure learning tasks that we consider, it is natural to ask how this performance is achieved. Mayrhofer and Waldmann (2011) propose that learners rely on a "broken link" heuristic and identify the structure that minimizes the number of cases where a cause is present but an effect is absent. They contrast their heuristic-based approach with Bayesian accounts of structure learning that rely on patterns of conditional independence between variables. We propose a Bayesian account that falls in between these two alternatives. Like Mayrhofer and Waldmann, we believe that models which track patterns of conditional independence are often too powerful to capture the inferences made by resource-bounded human learners. Unlike Mayrhofer and Waldmann, we argue that a Bayesian approach is nevertheless useful for explaining why humans succeed in the tasks that we consider. In particular, we show that human inferences are influenced by two factors that are naturally captured by the prior and the likelihood of a Bayesian model: a preference for symmetric structures, and a preference for structures that explain the observed data without needing to invoke coincidences. We demonstrate that incorporating these factors allows a Bayesian model to account for our data better than an approach that relies on the broken-link heuristic alone. The sketch below gives one concrete reading of the broken-link count.
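As an illustration only, the following Python sketch counts, for each candidate structure, the observations in which some cause is active while one of its direct effects is inactive, and prefers the structure with the smallest count. The graph encoding, node names, and example observations are our own assumptions, not Mayrhofer and Waldmann's materials.

```python
# Sketch of a "broken link" count for a candidate causal structure.
# A structure is encoded as a dict mapping each cause to its direct effects,
# and an observation maps node names to True (active) or False (inactive).
# Both encodings and the example data are illustrative assumptions.

def broken_link_count(structure, observations):
    """Count cases where a cause is active but one of its direct effects is inactive."""
    count = 0
    for obs in observations:
        for cause, effects in structure.items():
            if obs[cause]:
                count += sum(1 for effect in effects if not obs[effect])
    return count

# Two hypothetical three-node candidates: a common cause versus a chain.
candidates = {
    "A -> B, A -> C": {"A": ["B", "C"], "B": [], "C": []},
    "A -> B -> C":    {"A": ["B"], "B": ["C"], "C": []},
}
observations = [
    {"A": True,  "B": True,  "C": True},
    {"A": False, "B": True,  "C": False},
    {"A": False, "B": False, "C": True},
]

# The heuristic prefers the structure with the fewest broken links.
scores = {name: broken_link_count(g, observations) for name, g in candidates.items()}
print(scores)                       # {'A -> B, A -> C': 0, 'A -> B -> C': 1}
print(min(scores, key=scores.get))  # 'A -> B, A -> C'
```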
Bayesian Structure Learning

The causal systems that we consider are simple activation networks. Each network can be represented as a graph G which may include cycles. Figure 1a shows one such graph and a data set D generated over the graph. Each row d_i in the data set D represents an observed pattern of activation: for example, the first row represents a case where nodes A, C and D are observed to be active and node B is observed to be inactive. We will assume that each row d_i is generated by activating a randomly chosen node then allowing activation to propagate through the network. For example, Figure 1b shows that if A is the randomly activated node, the final pattern of activation will match the first row of matrix D in Figure 1a.

The activation networks that we consider have three important properties. First, all causal links are generative, and these generative links combine according to an OR function. For example, node C in Figure 1a will be active if node A is
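As a rough sketch of this generative process, the Python code below propagates activation from a single randomly chosen root along generative links that combine by OR, and scores a candidate graph by the fraction of roots that reproduce an observed pattern. The four-node graph, the uniform choice of root, and the node names are assumptions made for illustration; the network in Figure 1a is not reproduced here.

```python
import random

# Deterministic activation network: a single root node is chosen at random and
# activation propagates deterministically along generative links, which combine
# by OR (a node becomes active as soon as any parent is active). The example
# graph and the uniform root distribution are illustrative assumptions.

def propagate(graph, root):
    """Activate `root` and propagate activation deterministically through the graph."""
    active = {root}
    frontier = [root]
    while frontier:
        node = frontier.pop()
        for child in graph.get(node, []):
            if child not in active:
                active.add(child)
                frontier.append(child)
    return {node: node in active for node in graph}

def sample_observation(graph):
    """Generate one row d_i of the data set D."""
    return propagate(graph, random.choice(list(graph)))

def likelihood(graph, observation):
    """P(d | G) under a uniform root: the fraction of roots that reproduce d exactly."""
    roots = list(graph)
    matches = sum(propagate(graph, root) == observation for root in roots)
    return matches / len(roots)

# Hypothetical network: A activates C and D, B activates C.
graph = {"A": ["C", "D"], "B": ["C"], "C": [], "D": []}

d1 = propagate(graph, "A")
print(d1)                     # {'A': True, 'B': False, 'C': True, 'D': True}
print(likelihood(graph, d1))  # 0.25: only root A yields this pattern
D = [sample_observation(graph) for _ in range(6)]  # a small data set like matrix D
```

Under this reading, the likelihood term rewards structures that make each observed row likely rather than merely possible, while a prior over graphs could encode the preference for symmetric structures mentioned above.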
[1] Philip M. Fernbach, et al. Causal learning with local computations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 2009.
[2] Joshua B. Tenenbaum, et al. Inferring causal networks from observations and interventions. Cognitive Science, 2003.
[3] A. Yuille, et al. Bayesian generic priors for causal learning. Psychological Review, 2008.
[4] S. Sloman, et al. The advantage of timely intervention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 2004.
[5] T. Lombrozo, et al. Simplicity and probability in causal explanation. Cognitive Psychology, 2007.
[6] Bob Rehder, et al. A Generative Model of Causal Cycles. CogSci, 2011.
[7] Ralf Mayrhofer, et al. Heuristics in Covariation-based Induction of Causal Models: Sufficiency and Necessity Priors. CogSci, 2011.
[8] L. Schulz, et al. God does not play dice: Causal determinism and preschoolers' causal inferences. Child Development, 2006.
[9] P. White. How well is causal structure inferred from cooccurrence information? 2006.