Gene Pathways Discovery in Asbestos-Related Diseases using Local Causal Discovery Algorithm

To learn about the progression of a complex disease, it is necessary to understand the physiology and function of many genes operating together in distinct interactions as a system. To significantly advance our understanding of how such a system functions, we need to learn the causal relationships among its modeled genes. To this end, it is desirable to compare experiments on the system under complete interventions on some genes (e.g., gene knock-outs) with experiments on the system without interventions. However, it is expensive and difficult, if not impossible, to conduct wet-lab experiments with complete gene interventions in animal models such as the mouse. It would therefore be helpful to discover promising causal relationships among genes from observational data alone, in order to identify genes worth perturbing whose effects can later be verified in wet laboratories. Although causal Bayesian networks have been actively used to discover gene pathways, most algorithms that discover pairwise causal relationships from observational data alone identify only a small number of significant relationships, even with a large dataset. In this article, we introduce new causal discovery algorithms, the Equivalence Local Implicit latent variable scoring Method (EquLIM) and EquLIM with a Markov chain Monte Carlo search algorithm (EquLIM-MCMC), that identify promising causal relationships even with a small observational dataset.
