Causal Inference for Statistics, Social, and Biomedical Sciences: Causality: The Basic Framework

INTRODUCTION

In this introductory chapter we set out our basic framework for causal inference. We discuss three key notions underlying our approach. The first notion is that of potential outcomes, each corresponding to one of the levels of a treatment or manipulation, following the dictum “no causation without manipulation” (Rubin, 1975, p. 238). Each of these potential outcomes is a priori observable, in the sense that it could be observed if the unit were to receive the corresponding treatment level. But, a posteriori, that is, once a treatment is applied, at most one potential outcome can be observed. Second, we discuss the necessity, when drawing causal inferences, of observing multiple units, and the utility of the related stability assumption, which we use throughout most of this book to exploit the presence of multiple units. Finally, we discuss the central role of the assignment mechanism, which is crucial for inferring causal effects, and which serves as the organizing principle for this book.

POTENTIAL OUTCOMES

In everyday life, causal language is widely used in an informal way. One might say: “My headache went away because I took an aspirin,” or “She got a good job last year because she went to college,” or “She has long hair because she is a girl.” Such comments are typically informed by observations on past exposures, for example, of headache outcomes after taking aspirin or not, or of characteristics of jobs of people with or without college educations, or the typical hair length of boys and girls. As such, these observations generally involve informal statistical analyses, drawing conclusions from associations between measurements of different quantities that vary from individual to individual, commonly called variables or random variables – language apparently first used by Yule (1897). Nevertheless, statistical theory has been relatively silent on questions of causality.
Many, especially older, textbooks avoid any mention of the term other than in settings of randomized experiments.