Predicting diabetes mellitus genes via protein-protein interaction and protein subcellular localization information

Background: Diabetes mellitus, characterized by hyperglycemia resulting from insufficient production of or reduced sensitivity to insulin, poses a growing threat to human health. It is a heterogeneous disorder with multiple etiologies, and includes type 1 diabetes, type 2 diabetes, gestational diabetes and other forms. Predicting diabetes-associated proteins/genes is a key step toward understanding the cellular mechanisms underlying diabetes mellitus. Compared with experimental methods, computational prediction of candidate proteins/genes is cheaper and less laborious. Protein-protein interaction (PPI) data produced by high-throughput technologies have been used to prioritize candidate disease genes/proteins. However, false interactions in PPI data seriously degrade the performance of computational methods. To address this problem, new methods have been developed that identify candidate disease genes/proteins by integrating biological data from other sources.

Results: In this study, a new framework called PDMG is proposed to predict candidate disease genes/proteins. First, weighted networks are built by combining subcellular localization information with PPI data. To form the weighted networks, the importance of each compartment is evaluated based on the number of interacting proteins in that compartment. This is because different compartments play very different roles in cellular activities, and some compartments are more important than others. Based on the evaluated compartments, the interactions between proteins are scored and the weighted PPI networks are constructed. Second, known disease genes are extracted from the OMIM database and used as seed genes to expand disease-specific networks from the weighted networks. Third, the weights between a protein and its neighbors in the disease-related networks are added together, and the sum is taken as the score of the protein. Finally, the proteins are ranked in descending order of their scores. The top-ranked candidates are considered potential disease-related proteins. Various types of data, such as type 2 diabetes-associated genes, subcellular localizations and protein interactions, are used to test the PDMG method.

Conclusions: The results show that proteins/genes that functionally exert a direct influence on diabetes are consistently ranked at the top of the list. PDMG expands the seed set into 445 ranked candidate proteins, which include the original 27 type 2 diabetes proteins. Of the top 27 proteins, 14 are known type 2 diabetes proteins. Of the remaining 13 novel proteins, 8 are associated with diabetes according to literature retrieved from the PubMed database.
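The four steps described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the normalization of compartment importance, the rule that an interaction takes the score of the most important compartment shared by both partners, the zero weight for pairs with no shared compartment, and the one-step neighbor expansion from the seeds are all assumptions made for illustration. The function names and the toy proteins A to E are hypothetical.

```python
from collections import defaultdict


def compartment_importance(edges, localization):
    """Score each compartment by the number of interacting proteins it holds.

    Dividing by the largest count is an assumption; the abstract only says the
    score is based on the number of interacting proteins in the compartment.
    """
    interacting = {p for edge in edges for p in edge}
    counts = defaultdict(int)
    for protein in interacting:
        for comp in localization.get(protein, ()):
            counts[comp] += 1
    largest = max(counts.values(), default=1)
    return {comp: n / largest for comp, n in counts.items()}


def weight_edges(edges, localization, importance, default=0.0):
    """Weight each interaction by the most important shared compartment (assumed rule)."""
    weights = {}
    for a, b in edges:
        shared = set(localization.get(a, ())) & set(localization.get(b, ()))
        # Pairs with no shared compartment fall back to `default` (assumption).
        weights[frozenset((a, b))] = max(
            (importance[c] for c in shared), default=default
        )
    return weights


def expand_from_seeds(weights, seeds):
    """Keep the weighted edges that touch at least one seed (one-step expansion, assumed)."""
    seeds = set(seeds)
    return {edge: w for edge, w in weights.items() if edge & seeds}


def rank_proteins(disease_edges):
    """Score each protein by the sum of its edge weights, then sort in descending order."""
    scores = defaultdict(float)
    for edge, w in disease_edges.items():
        for protein in edge:
            scores[protein] += w
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data (hypothetical proteins and localizations, for illustration only).
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
localization = {
    "A": {"nucleus"},
    "B": {"nucleus", "cytosol"},
    "C": {"cytosol"},
    "D": {"cytosol"},
    "E": {"membrane"},
}
seeds = {"B"}

importance = compartment_importance(edges, localization)
weights = weight_edges(edges, localization, importance)
disease_network = expand_from_seeds(weights, seeds)
ranking = rank_proteins(disease_network)
print(ranking)
```

In this toy network the cytosol holds the most interacting proteins, so the B-C interaction outweighs the A-B interaction, and the seed B ranks first, followed by C and then A. The D-E interaction has no shared compartment and receives the default weight, which shows how localization evidence can down-weight interactions that may be false positives.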
