Knowledge Engineering Cardiovascular Bayesian Networks from the Literature

Bayesian networks (BNs) are rapidly becoming a tool of choice for applied Artificial Intelligence. There have been many medical applications of BNs, but few have applied data mining methods to epidemiology. In a previous study we looked at such an application to epidemiological data, specifically the assessment of risk for coronary heart disease. That study featured two Bayesian networks “knowledge-engineered” from the epidemiology literature, but postponed a detailed discussion of their construction. This report provides the full details of our engineering choices and the reasons for them. It will interest anyone wishing to replicate our results or to check our assumptions or methods. It may also interest those wishing to build a similar Bayesian network from the epidemiological literature for risk prediction of other medical conditions, as it provides a case study of the steps involved. We implemented the BNs in the Bayesian network software package Netica, which raised particular challenges. This report notes specific Netica traps and tricks, which may help others avoid some of the difficulties we encountered. However, the approach described here should extend to any system that allows equations to specify the probability distributions for each node.
