An integrative modular approach to systematically predict gene-phenotype associations

Background: Complex human diseases are often caused by multiple mutations, each of which contributes only a minor effect to the disease phenotype. To study the basis for these complex phenotypes, we developed a network-based approach to identify coexpression modules specifically activated in particular phenotypes. We integrated these modules, protein-protein interaction data, Gene Ontology annotations, and our database of gene-phenotype associations derived from literature to predict novel human gene-phenotype associations. Our systematic predictions provide us with the opportunity to perform a global analysis of human gene pleiotropy and its underlying regulatory mechanisms.

Results: We applied this method to 338 microarray datasets, covering 178 phenotype classes, and identified 193,145 phenotype-specific coexpression modules. We trained random forest classifiers for each phenotype and predicted a total of 6,558 gene-phenotype associations. We showed that 40.9% of genes are pleiotropic, highlighting that pleiotropy is more prevalent than previously expected. We collected 77 ChIP-chip datasets studying 69 transcription factors binding over 16,000 targets under various phenotypic conditions. Utilizing this unique data source, we confirmed that dynamic transcriptional regulation is an important force driving the formation of phenotype-specific gene modules.

Conclusion: We created a genome-wide gene-to-phenotype mapping that has many potential implications, including suggesting new drug targets and uncovering the basis for human disease phenotypes. Our analysis of these phenotype-specific coexpression modules reveals a high prevalence of gene pleiotropy, and suggests that phenotype-specific transcription factor binding may contribute to phenotypic diversity. All resources from our study are made freely available on our online Phenotype Prediction Database [1].
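The pleiotropy figure reported above can be illustrated with a minimal sketch. It assumes that a gene counts as pleiotropic when it is associated with two or more distinct phenotypes; the abstract does not state the exact criterion used, so this threshold is an assumption. The function name `pleiotropy_fraction`, the gene labels, and the phenotype labels are hypothetical, and the toy data are made up rather than taken from the study.

```python
from collections import defaultdict


def pleiotropy_fraction(associations):
    """Return the fraction of genes linked to two or more distinct phenotypes.

    `associations` is an iterable of (gene, phenotype) pairs.
    The ">= 2 phenotypes" criterion is an assumption of this sketch.
    """
    phenotypes_by_gene = defaultdict(set)
    for gene, phenotype in associations:
        # A set ensures a repeated (gene, phenotype) pair is counted once.
        phenotypes_by_gene[gene].add(phenotype)

    if not phenotypes_by_gene:
        return 0.0

    pleiotropic = sum(
        1 for phenotypes in phenotypes_by_gene.values() if len(phenotypes) >= 2
    )
    return pleiotropic / len(phenotypes_by_gene)


# Toy, made-up associations (not data from the study).
toy_associations = [
    ("GENE_A", "phenotype_1"),
    ("GENE_A", "phenotype_2"),
    ("GENE_B", "phenotype_1"),
    ("GENE_C", "phenotype_3"),
    ("GENE_C", "phenotype_3"),  # duplicate pair, counted once
    ("GENE_D", "phenotype_2"),
    ("GENE_D", "phenotype_3"),
]

print(pleiotropy_fraction(toy_associations))  # 2 of 4 genes -> 0.5
```

Applied to a full set of predicted gene-phenotype pairs, the same counting would yield a genome-wide estimate, although the result depends on how phenotype classes are defined and merged.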

[1]  Haifeng Li, et al.  Systematic discovery of functional modules and context-specific functional annotation of human genome, 2007.

[2]  Dennis B. Troup, et al.  NCBI GEO: mining tens of millions of expression profiles—database and tools update, 2006, Nucleic Acids Res.

[3]  A. Butte,et al.  Creation and implications of a phenome-genome network , 2006, Nature Biotechnology.

[4]  Hans Kresse,et al.  Differential expression of the small chondroitin/dermatan sulfate proteoglycans decorin and biglycan after injury of the adult rat brain , 1995, Brain Research.

[5]  Rosario M. Piro, et al.  Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis, 2008, PLoS Comput. Biol.

[6]  Pall I. Olason,et al.  A human phenome-interactome network of protein complexes implicated in genetic disorders , 2007, Nature Biotechnology.

[7]  A. Pellicer,et al.  Cytokine pleiotropy and redundancy--gp130 cytokines in human implantation. , 1999, Immunology today.

[8]  Richard G. Jenner,et al.  Coordinated binding of NF-kappaB family members in the response of human cells to lipopolysaccharide. , 2006, Proceedings of the National Academy of Sciences of the United States of America.

[9]  M. Daly,et al.  Genome-wide association studies for common diseases and complex traits , 2005, Nature Reviews Genetics.

[10]  Nicola J. Rinaldi,et al.  Transcriptional Regulatory Networks in Saccharomyces cerevisiae , 2002, Science.

[11]  Michael Karin,et al.  NF-kappaB: linking inflammation and immunity to cancer development and progression. , 2005, Nature reviews. Immunology.

[12]  E. Sage,et al.  A prototypic matricellular protein in the tumor microenvironment—Where there's SPARC, there's fire , 2008, Journal of cellular biochemistry.

[13]  大塚 篤史, et al.  Expression and functional role of β-adrenoceptors in the human urinary bladder urothelium, 2008.

[14]  D. Clayton,et al.  Genome-wide association studies: theoretical and practical concerns , 2005, Nature Reviews Genetics.

[15]  Philip S. Yu,et al.  A graph-based approach to systematically reconstruct human transcriptional regulatory modules , 2007, ISMB/ECCB.

[16]  Mrinal Kalakrishnan,et al.  An Integrative Network Approach to Map the Transcriptome to the Phenome , 2008, RECOMB.

[17]  C. Geczy,et al.  Serum and mucosal S100 proteins, calprotectin (S100A8/S100A9) and S100A12, are elevated at diagnosis in children with inflammatory bowel disease , 2007, Scandinavian journal of gastroenterology.

[18]  H. Müller,et al.  Purification of a Meningeal Cell‐derived Chondroitin Sulphate Proteoglycan with Neurotrophic Activity for Brain Neurons and its Identification as Biglycan , 1995, The European journal of neuroscience.

[19]  Olivier Bodenreider, et al.  The Unified Medical Language System (UMLS): integrating biomedical terminology, 2004, Nucleic Acids Res.

[20]  Motowo Nakajima,et al.  Expression of three extracellular matrix degradative enzymes in bladder cancer , 2001, International journal of cancer.

[21]  Elizabeth Pennisi,et al.  Why Do Humans Have So Few Genes? , 2005, Science.

[22]  Alexander Brill,et al.  Platelet‐derived microparticles promote invasiveness of prostate cancer cells via upregulation of MMP‐2 production , 2009, International journal of cancer.

[23]  Jiawei Han,et al.  Mining coherent dense subgraphs across massive biological networks for functional discovery , 2005, ISMB.

[24]  Jianzhi Zhang,et al.  Toward a Molecular Understanding of Pleiotropy , 2006, Genetics.

[25]  R. Stern,et al.  The extracellular matrix of the central and peripheral nervous systems: structure and function. , 1988, Journal of neurosurgery.

[26]  G. Ayala,et al.  Reactive stroma in human prostate cancer: induction of myofibroblast phenotype and extracellular matrix remodeling. , 2002, Clinical cancer research : an official journal of the American Association for Cancer Research.

[27]  Lan V. Zhang,et al.  Evidence for dynamically organized modularity in the yeast protein–protein interaction network , 2004, Nature.

[28]  G. Church,et al.  A global view of pleiotropy and phenotypically derived gene function in yeast , 2005, Molecular systems biology.

[29]  Haifeng Li,et al.  Systematic discovery of functional modules and context-specific functional annotation of human genome , 2007, ISMB/ECCB.

[30]  Nicholas Katsanis,et al.  The ciliopathies: an emerging class of human genetic disorders. , 2006, Annual review of genomics and human genetics.

[31]  A. Chapelle,et al.  Mutations in the RNA Component of RNase MRP Cause a Pleiotropic Human Disease, Cartilage-Hair Hypoplasia , 2001, Cell.

[32]  L. Stein,et al.  RNAi analysis of genes expressed in the ovary of Caenorhabditis elegans , 2000, Current Biology.

[33]  Karen Tiede,et al.  Smad4/DPC4-dependent Regulation of Biglycan Gene Expression by Transforming Growth Factor-β in Pancreatic Tumor Cells* , 2002, The Journal of Biological Chemistry.

[34]  A. Barabasi,et al.  Hierarchical Organization of Modularity in Metabolic Networks , 2002, Science.

[35]  Neda Nategh, et al.  Evidence for dynamically organized modularity in the yeast protein-protein interaction network, 2006.