Causal reasoning on biological networks: interpreting transcriptional changes

MOTIVATION: The interpretation of high-throughput datasets has remained one of the central challenges of computational biology over the past decade. Furthermore, as the amount of biological knowledge increases, it becomes more and more difficult to integrate this large body of knowledge in a meaningful manner. In this article, we propose a particular solution to both of these challenges.

METHODS: We integrate available biological knowledge by constructing a network of molecular interactions of a specific kind: causal interactions. The resulting causal graph can be queried to suggest molecular hypotheses that explain the variations observed in a high-throughput gene expression experiment. We show that a simple scoring function can discriminate between a large number of competing molecular hypotheses about the upstream cause of the changes observed in a gene expression profile. We then develop an analytical method for computing the statistical significance of each score. This analytical method also helps assess the effects of random or adversarial noise on the predictive power of our model.

RESULTS: Our results show that the causal graph we constructed from known biological literature is extremely robust to random noise and to missing or spurious information. We demonstrate the power of our causal reasoning model on two specific examples, one from a cancer dataset and the other from a cardiac hypertrophy experiment. We conclude that causal reasoning models provide a valuable addition to the biologist's toolkit for the interpretation of gene expression data.

AVAILABILITY AND IMPLEMENTATION: R source code for the method is available upon request.
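The abstract does not spell out the scoring function, so the following Python sketch is only an illustration of the general idea of scoring upstream hypotheses against a signed causal graph. It assumes a score equal to the number of correctly predicted transcript changes minus the number of incorrectly predicted ones, and it considers direct (one-step) edges only. The graph, the gene names, and the function names are all hypothetical. The significance estimate uses a simple label permutation as a stand-in; it is not the analytical method the authors describe.

```python
import random

# Hypothetical toy causal graph: (source, target) -> sign.
# +1 means the source increases the target, -1 means it decreases it.
EDGES = {
    ("TF_A", "gene1"): +1,
    ("TF_A", "gene2"): +1,
    ("TF_A", "gene3"): -1,
    ("TF_B", "gene2"): -1,
    ("TF_B", "gene4"): +1,
}

# Hypothetical observed changes: +1 up, -1 down, 0 unchanged.
OBSERVED = {"gene1": +1, "gene2": +1, "gene3": -1, "gene4": 0}


def predict(edges, source, direction):
    """Predicted sign of each direct target if `source` moves in `direction`."""
    return {tgt: direction * sign
            for (src, tgt), sign in edges.items() if src == source}


def score(edges, source, direction, observed):
    """Assumed score: correct predictions minus incorrect predictions."""
    correct = incorrect = 0
    for gene, predicted in predict(edges, source, direction).items():
        seen = observed.get(gene, 0)
        if seen == 0:
            continue  # unchanged or unmeasured genes do not count either way
        if seen == predicted:
            correct += 1
        else:
            incorrect += 1
    return correct - incorrect


def rank_hypotheses(edges, observed):
    """Score every (source, direction) hypothesis, best first."""
    sources = sorted({src for src, _ in edges})
    scored = [((src, d), score(edges, src, d, observed))
              for src in sources for d in (+1, -1)]
    return sorted(scored, key=lambda item: -item[1])


def permutation_p_value(edges, source, direction, observed,
                        n_perm=1000, seed=0):
    """Empirical p-value from shuffling observed labels across genes.

    This is a substitute for illustration, not the paper's analytical method.
    """
    rng = random.Random(seed)
    actual = score(edges, source, direction, observed)
    genes = list(observed)
    values = [observed[g] for g in genes]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        if score(edges, source, direction, dict(zip(genes, values))) >= actual:
            hits += 1
    return (hits + 1) / (n_perm + 1)


best_hypothesis, best_score = rank_hypotheses(EDGES, OBSERVED)[0]
p_value = permutation_p_value(EDGES, *best_hypothesis, OBSERVED)
print(best_hypothesis, best_score, p_value)
```

In this toy case the hypothesis "TF_A increased" explains all three changed genes, so it ranks first. A fuller treatment would also propagate signs along multi-step paths, which this sketch deliberately leaves out.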
