Incorporating prior expert knowledge in learning Bayesian networks from genetic epidemiological data

We consider the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. Most state-of-the-art BN structure learning algorithms cannot learn structures from data containing missing values, which are the norm in genetic epidemiological data. In addition, a wealth of prior knowledge exists that could be incorporated to improve the computational efficiency of BN structure learning. To address these challenges, we applied a Markov chain Monte Carlo (MCMC) based BN structure learning algorithm to data from a population-based study of bladder cancer in New Hampshire, USA. This approach yielded a large improvement in computational efficiency.
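To make the idea of MCMC-based structure learning with prior knowledge concrete, the following is a minimal sketch, not the algorithm used in the study. It assumes complete discrete data (it does not handle missing values), scores graphs with the K2 marginal likelihood, and encodes prior knowledge as a hard list of forbidden edges, which shrinks the space of graphs the sampler has to explore. All names (`family_score`, `structure_mcmc`, the toy gene/exposure/disease variables) are hypothetical and chosen for illustration only.

```python
import math
import random
from collections import Counter


def family_score(data, child, parents, arity):
    """Log K2 marginal likelihood of one node given its parents (complete data only)."""
    r = arity[child]
    counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    totals = Counter()
    for (cfg, _), n in counts.items():
        totals[cfg] += n
    score = sum(math.lgamma(n + 1) for n in counts.values())
    score += sum(math.lgamma(r) - math.lgamma(n + r) for n in totals.values())
    return score


def has_path(parents, src, dst):
    """True if a directed path src -> ... -> dst exists (walks back through parents)."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def structure_mcmc(data, arity, forbidden, steps, seed=0):
    """Metropolis-Hastings over DAGs using single-edge toggle moves.

    Assumption: a uniform prior over DAGs that contain no forbidden edge.
    Returns the fraction of iterations in which each edge was present.
    """
    rng = random.Random(seed)
    n = len(arity)
    parents = {v: set() for v in range(n)}
    local = [family_score(data, v, (), arity) for v in range(n)]
    edge_counts = Counter()
    for _ in range(steps):
        u, v = rng.sample(range(n), 2)  # candidate edge u -> v
        if u in parents[v]:
            proposal = parents[v] - {u}  # delete an existing edge
        elif (u, v) in forbidden or has_path(parents, v, u):
            proposal = None  # ruled out by prior knowledge or would create a cycle
        else:
            proposal = parents[v] | {u}  # add a new edge
        if proposal is not None:
            new = family_score(data, v, tuple(sorted(proposal)), arity)
            # The toggle proposal is symmetric, so only the score ratio matters.
            if rng.random() < math.exp(min(0.0, new - local[v])):
                parents[v], local[v] = proposal, new
        for child, ps in parents.items():
            for p in ps:
                edge_counts[(p, child)] += 1
    return {edge: c / steps for edge, c in edge_counts.items()}


# Toy data (synthetic, for illustration): 0 = gene, 1 = exposure, 2 = disease.
gen = random.Random(1)
data = []
for _ in range(200):
    gene, exposure = gen.randint(0, 1), gen.randint(0, 1)
    disease = gene if gen.random() < 0.9 else 1 - gene
    data.append((gene, exposure, disease))

# Hypothetical prior knowledge: nothing causes the genotype,
# and disease does not cause the exposure.
forbidden = {(1, 0), (2, 0), (2, 1)}
posterior = structure_mcmc(data, arity=[2, 2, 2], forbidden=forbidden, steps=2000)
for edge, prob in sorted(posterior.items()):
    print(edge, round(prob, 3))
```

Because only one parent set changes per move, the sketch rescores a single family rather than the whole graph. Forbidden edges are rejected before any scoring is done, which is one simple way in which prior knowledge can reduce computation.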
