Modularity-based credible prediction of disease genes and detection of disease subtypes on the phenotype-gene heterogeneous network

Background: Protein-protein interaction networks and phenotype similarity information have been integrated to discover novel disease-causing genes. Genetic or phenotypic similarities are manifested as modularity properties in a phenotype-gene heterogeneous network, which consists of the phenotype-phenotype similarity network, the protein-protein interaction network and the gene-disease association network. However, the quantitative analysis of modularity in the heterogeneous network, and its influence on disease-gene discovery, remain unaddressed. Furthermore, the genetic correspondence of disease subtypes can be identified by marking genes and phenotypes in the phenotype-gene network. We present a novel network inference method to measure network modularity and, in particular, to suggest disease subtypes based on the heterogeneous network.

Results: Based on a measure introduced to evaluate the closeness between two nodes in the phenotype-gene heterogeneous network, we developed a hitting-time-based method, CIPHER-HIT, for assessing the modularity of disease gene predictions, credibly prioritizing disease-causing genes, and then identifying the genetic modules corresponding to potential subtypes of the queried phenotype. CIPHER-HIT does not rely on any preset parameters. We found that, when modularity levels are taken into account, CIPHER-HIT significantly improves the performance of disease gene prediction, which demonstrates that modularity is a key feature for credible inference of disease genes on the phenotype-gene heterogeneous network. Applying CIPHER-HIT to the subtype analysis of breast cancer, we found that the prioritized genes can be divided into two sub-modules: one contains members of the Fanconi anemia gene family, and the other contains a reported protein complex, MRE11/RAD50/NBN.

Conclusions: The phenotype-gene heterogeneous network contains abundant information not only for disease gene discovery but also for disease subtype detection. The CIPHER-HIT method presented here is effective for network inference, particularly for credible prediction of disease genes and for subtype analysis of diseases such as breast cancer. This method provides a promising way to analyze heterogeneous biological networks, both globally and locally.
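
The abstract does not spell out the closeness measure, so the following is only a minimal sketch of the general idea of a hitting time on a network, not the CIPHER-HIT implementation. It assumes an undirected, connected graph given as an adjacency matrix and a simple random walk; the function name `hitting_times` and the three-node toy network are hypothetical. The hitting time from node i to a target node is the expected number of steps a random walk started at i takes to first reach the target, and it can be obtained by solving a linear system.

```python
import numpy as np


def hitting_times(adjacency, target):
    """Expected number of random-walk steps from every node to `target`.

    Assumes an undirected, connected graph (illustrative only).
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    degrees = A.sum(axis=1)
    if np.any(degrees == 0):
        raise ValueError("every node needs at least one edge")

    # Row-normalise the adjacency matrix into transition probabilities.
    P = A / degrees[:, None]

    # For i != target: h(i) = 1 + sum_k P[i, k] * h(k), with h(target) = 0.
    # Restricting to the non-target nodes gives (I - Q) h = 1.
    others = [i for i in range(n) if i != target]
    Q = P[np.ix_(others, others)]
    h_others = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))

    h = np.zeros(n)
    h[others] = h_others
    return h


# Toy network (hypothetical): node 0 is a phenotype, nodes 1 and 2 are genes.
# The phenotype is associated with gene 1, and gene 1 interacts with gene 2.
toy = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
h = hitting_times(toy, target=0)
ranking = sorted([1, 2], key=lambda gene: h[gene])  # closest gene first
```

On this toy network the hitting times to the phenotype are 3 steps from gene 1 and 4 steps from gene 2, so gene 1 ranks first. A smaller hitting time means a node is closer to the queried phenotype in the random-walk sense.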
