Bayesian Networks

Probabilistic models based on directed acyclic graphs (DAGs) have a long and rich tradition, which began with the geneticist Sewall Wright (1921). Variants have appeared in many fields; within cognitive science and artificial intelligence, such models are known as Bayesian networks. Their initial development in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier, ad hoc rule-based schemes [Pearl, 1988, Shafer and Pearl, 1990, Heckerman et al., 1995, Jensen, 1996].

The nodes in a Bayesian network represent propositional variables of interest (e.g., the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event), and the links represent informational or causal dependencies among the variables. The dependencies are quantified by conditional probabilities for each node given its parents in the network. The network supports the computation of the probabilities of any subset of variables given evidence about any other subset.

Figure 1 illustrates a simple yet typical Bayesian network. It describes the causal relationships among the season of the year (X1), whether it is raining (X2), whether the sprinkler is on (X3), whether the pavement is wet (X4), and whether the pavement is slippery (X5). Here, the absence of a direct link between X1 and X5, for example, captures our understanding that there is no direct influence of season on slipperiness; the influence is mediated by the wetness of the pavement. (If freezing is a possibility, then a direct link could be added.)
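To make the factorization and the bidirectional-inference claim concrete, the following is a minimal sketch in Python of the Figure 1 network. The graph structure follows the text, but the conditional probability tables (and the collapse of season to two values) are hypothetical placeholders, and the brute-force enumeration shown here is only workable for toy networks; practical systems use message-passing or sampling algorithms instead.

```python
from itertools import product

# Hypothetical CPTs for the Figure 1 network; the numbers are made up.
P_season = {"dry": 0.5, "wet": 0.5}               # P(X1), season collapsed to two values
P_rain = {"dry": 0.1, "wet": 0.6}                 # P(X2=True | X1)
P_sprinkler = {"dry": 0.5, "wet": 0.1}            # P(X3=True | X1)
P_wet = {(True, True): 0.99, (True, False): 0.9,  # P(X4=True | X2, X3)
         (False, True): 0.9, (False, False): 0.0}
P_slippery = {True: 0.8, False: 0.0}              # P(X5=True | X4)

def bern(p, value):
    """Probability that a Bernoulli event with parameter p takes `value`."""
    return p if value else 1.0 - p

def joint(season, rain, sprinkler, wet, slippery):
    """The joint distribution factors into one term per node given its parents."""
    return (P_season[season]
            * bern(P_rain[season], rain)
            * bern(P_sprinkler[season], sprinkler)
            * bern(P_wet[(rain, sprinkler)], wet)
            * bern(P_slippery[wet], slippery))

def query(target, evidence):
    """P(target | evidence) by brute-force enumeration (fine for 5 nodes)."""
    names = ("season", "rain", "sprinkler", "wet", "slippery")
    num = den = 0.0
    for vals in product(P_season, *(4 * [(True, False)])):
        world = dict(zip(names, vals))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(*vals)
        den += p
        if all(world[k] == v for k, v in target.items()):
            num += p
    return num / den

# Bidirectional inference: given that the pavement is slippery (X5),
# evidence flows "upward" against the causal arrows to the sprinkler (X3).
print(query({"sprinkler": True}, {"slippery": True}))
```

Each factor in `joint` is exactly one node conditioned on its parents, which is the factorization a Bayesian network encodes; the same five tables answer both predictive queries (effects given causes) and diagnostic ones like the sprinkler query above.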

[1] Judea Pearl et al. Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach. AAAI, 1982.

[2] Keiji Kanazawa et al. A model for reasoning about persistence and causation. 1989.

[3] Thomas L. Griffiths et al. Structure Learning in Human Causal Induction. NIPS, 2000.

[4] Michael P. Wellman et al. Real-world applications of Bayesian networks. CACM, 1995.

[5] Nir Friedman et al. The Bayesian Structural EM Algorithm. UAI, 1998.

[6] D. A. Kenny et al. Correlation and Causation. 1982.

[7] David J. Spiegelhalter et al. Local computations with probabilities on graphical structures and their application to expert systems. 1990.

[8] Nevin Lianwen Zhang et al. Exploiting Causal Independence in Bayesian Network Inference. J. Artif. Intell. Res., 1996.

[9] Franz von Kutschera et al. Causation. J. Philos. Log., 1993.

[10] Judea Pearl et al. Evidential Reasoning Using Stochastic Simulation of Causal Models. Artif. Intell., 1987.

[11] Judea Pearl et al. Causation, Action, and Counterfactuals. 2004.

[12] Judea Pearl et al. Probabilistic reasoning in intelligent systems. 1988.

[13] Judea Pearl et al. Qualitative Probabilities for Default Reasoning, Belief Revision, and Causal Modeling. Artif. Intell., 1996.

[14] Judea Pearl et al. Counterfactuals and Policy Analysis in Structural Models. UAI, 1995.

[15] Thomas G. Dietterich. What is machine learning? Archives of Disease in Childhood, 2020.

[16] Joseph Y. Halpern. An Analysis of First-Order Logics of Probability. IJCAI, 1989.

[17] S. Lauritzen. The EM algorithm for graphical association models with missing data. 1995.

[18] Judea Pearl et al. A Theory of Inferred Causation. KR, 1991.