Incorporating expert knowledge when learning Bayesian network structure: A medical case study

OBJECTIVES Bayesian networks (BNs) are rapidly becoming a leading technology in applied Artificial Intelligence, with many applications in medicine. Both automated learning of BNs and expert elicitation have been used to build these networks, but the potentially more useful combination of these two methods remains underexplored. In this paper we examine a number of approaches to their combination when learning structure and present new techniques for assessing their results.

METHODS AND MATERIALS Using public-domain medical data, we run an automated causal discovery system, CaMML, which allows the incorporation of multiple kinds of prior expert knowledge into its search, to test and compare unbiased discovery with discovery biased with different kinds of expert opinion. We use adjacency matrices enhanced with numerical and colour labels to assist with the interpretation of the results. We present an algorithm for generating a single BN from a set of learned BNs that incorporates user preferences regarding complexity vs completeness. These techniques are presented as part of the first detailed workflow for hybrid structure learning within the broader knowledge engineering process.

RESULTS The detailed knowledge engineering workflow is shown to be useful for structuring a complex iterative BN development process. The adjacency matrices make it clear that for our medical case study using the IOWA dataset, the simplest kind of prior information (partially sorting variables into tiers) was more effective in aiding model discovery than either using no prior information or using more sophisticated and detailed expert priors. The method for generating a single BN captures relationships that would be overlooked by other approaches in the literature.

CONCLUSION Hybrid causal learning of BNs is an important emerging technology. We present methods for incorporating it into the knowledge engineering process, including visualisation and analysis of the learned networks.
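As a rough illustration of two ideas mentioned above (tier priors and merging a set of learned BNs into one), the following Python sketch may help. It is not the algorithm of the paper and does not use CaMML: it assumes each learned network is given simply as a list of directed arcs, counts how often each arc appears, and keeps arcs whose frequency meets a user-chosen threshold while skipping any arc that would create a cycle. All function names, the `threshold` parameter, and the `tier_of` mapping are hypothetical and introduced only for this example.

```python
from collections import Counter


def arc_frequencies(networks):
    """Fraction of networks that contain each directed arc (parent, child)."""
    counts = Counter(arc for arcs in networks for arc in set(arcs))
    return {arc: n / len(networks) for arc, n in counts.items()}


def creates_cycle(arcs, parent, child):
    """True if adding parent -> child to `arcs` would close a directed cycle."""
    # A cycle appears exactly when `parent` is already reachable from `child`.
    stack, seen = [child], set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(c for p, c in arcs if p == node)
    return False


def consensus_network(networks, threshold):
    """Assumed merging rule: keep arcs seen in at least `threshold` of the networks.

    A lower threshold gives a more complete but more complex graph;
    a higher threshold gives a sparser one.
    """
    freqs = arc_frequencies(networks)
    chosen = []
    # Consider the most frequent arcs first; ties are broken alphabetically.
    for (parent, child), freq in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= threshold and not creates_cycle(chosen, parent, child):
            chosen.append((parent, child))
    return chosen


def violates_tiers(arc, tier_of):
    """True if an arc points from a later tier back to an earlier tier."""
    parent, child = arc
    return tier_of[parent] > tier_of[child]
```

In this sketch the threshold plays the role of a complexity versus completeness preference, and `violates_tiers` shows how a partial ordering of variables into tiers can rule out arcs before or after search. The paper's own method may weigh arcs and preferences quite differently.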
