Identifying Causal Genes and Dysregulated Pathways in Complex Diseases

In complex diseases, different combinations of genomic perturbations often lead to the same phenotype. On a molecular level, such combinations are assumed to dysregulate the same cellular pathways. This pathway-centric perspective is fundamental to understanding the mechanisms of complex diseases and to identifying potential drug targets. To provide an integrated perspective on complex disease mechanisms, we developed a novel computational method that simultaneously identifies causal genes and dysregulated pathways. We first identified a representative set of genes that are differentially expressed in cancer compared to non-tumor control cases. Assuming that disease-associated gene expression changes are caused by genomic alterations, we then determined potential paths from such genomic causes to target genes through a network of molecular interactions. Applying our method to sets of genomic alterations and gene expression profiles of 158 glioblastoma multiforme (GBM) patients, we uncovered putative causal genes that potentially play a role in the disease, together with causal paths that are potentially responsible for the altered expression of disease genes. By combining an expression quantitative trait loci (eQTL) analysis with pathway information, our approach allowed us not only to identify potential causal genes but also to find intermediate nodes and pathways that mediate the information flow between causal and target genes. Our results indicate that different genomic perturbations indeed dysregulate the same functional pathways, supporting a pathway-centric perspective of cancer. While copy number alterations and gene expression data of glioblastoma patients provided an opportunity to test our approach, the method can be applied to any disease system in which genetic variations play a fundamental causal role.
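
The two stages described above can be illustrated with a minimal sketch. It is not the method used in this work: the eQTL analysis is replaced by a plain Pearson correlation between copy number and expression, and the information-flow computation is replaced by a breadth-first shortest-path search. All gene names, data values, the threshold, and the function names are hypothetical and chosen only for illustration.

```python
from collections import deque
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no association signal
    return sxy / (sxx * syy) ** 0.5


def associated_genes(copy_number, target_expression, threshold=0.8):
    """Stand-in for the eQTL step: genes whose copy number profile
    correlates strongly (in absolute value) with the target's expression."""
    return [gene for gene, profile in copy_number.items()
            if abs(pearson(profile, target_expression)) >= threshold]


def shortest_path(network, source, target):
    """Stand-in for the path-finding step: breadth-first search over an
    adjacency-list interaction network. Returns None if no path exists."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in network.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical toy data: one value per patient (five patients).
copy_number = {
    "gene_A": [2, 3, 4, 2, 5],
    "gene_C": [2, 2, 3, 2, 2],
}
target_expression = [1.0, 2.1, 2.9, 1.2, 4.0]  # expression of gene_T

# Hypothetical undirected interaction network as adjacency lists.
network = {
    "gene_A": ["gene_B", "gene_C"],
    "gene_B": ["gene_A", "gene_T"],
    "gene_C": ["gene_A"],
    "gene_T": ["gene_B"],
}

candidates = associated_genes(copy_number, target_expression)
paths = {gene: shortest_path(network, gene, "gene_T") for gene in candidates}
```

In this toy setting, the candidate causal gene is the one whose copy number tracks the target's expression, and the intermediate nodes on the returned path play the role of the mediating genes. A shortest path considers only one route, whereas a flow-based formulation would weigh all routes between causal and target genes.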
