K-core decomposition of a protein domain co-occurrence network reveals lower cancer mutation rates for interior cores

Background: Network biology currently focuses primarily on metabolic pathways, gene regulatory networks, and protein-protein interaction networks. While these approaches have yielded critical information, alternative approaches to network analysis can offer new perspectives on biological information. A little-explored area is the interactions between domains, which can be captured using domain co-occurrence networks (DCNs). A DCN represents protein domains and their co-existence in genes. Mapping cancer mutations onto the individual protein domains then makes it possible to study the function and interaction of proteins and to identify mutation signals.

Results: The DCN was constructed for the human proteome based on Pfam domains in proteins. Highly connected domains in the central cores were identified using k-core decomposition, and these domains were found to be more evolutionarily conserved than the peripheral domains. Somatic mutations for ovarian, breast and prostate cancers were obtained from the TCGA database. We mapped the somatic mutations to the individual protein domains and used the local false discovery rate to identify significantly mutated domains in each cancer type. Significantly mutated domains were found to be enriched in cancer disease pathways. However, the inner cores of the DCN did not contain any of the significantly mutated domains. The inner-core protein domains are highly conserved and co-occur with a large number of other protein domains.

Conclusion: Mutations and domain co-occurrence networks provide a framework for understanding hierarchical designs in protein function from a network perspective. This study provides evidence that a majority of protein domains in the inner core of the DCN have a lower mutation frequency and that protein domains in the peripheral regions of the k-core decomposition contribute more heavily to the disease. These findings may contribute further to drug development.
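
The two structural steps described above, building a DCN from per-protein domain lists and assigning each domain to a core, can be illustrated with a minimal sketch. It is not the authors' implementation: the protein and domain names (`P1`, `A`, and so on) are hypothetical toy data, the helper names `build_dcn` and `core_numbers` are invented for illustration, and the peeling loop is a simple quadratic version rather than an optimized linear-time algorithm.

```python
from collections import defaultdict
from itertools import combinations


def build_dcn(protein_domains):
    """Link two domains whenever they occur together in at least one protein."""
    graph = defaultdict(set)
    for domains in protein_domains.values():
        unique = set(domains)
        for d in unique:
            graph[d]  # touch the key so single-domain proteins still add a node
        for a, b in combinations(sorted(unique), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def core_numbers(graph):
    """Core number of each node, found by repeatedly peeling a minimum-degree node."""
    degree = {n: len(nbrs) for n, nbrs in graph.items()}
    remaining = set(graph)
    core = {}
    k = 0
    while remaining:
        # Pick the node with the smallest remaining degree (ties broken by name).
        node = min(remaining, key=lambda n: (degree[n], n))
        # The core level never decreases as peeling proceeds.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in graph[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Hypothetical toy proteome: protein ID -> domains found in that protein.
toy_proteins = {
    "P1": ["A", "B", "C", "D"],
    "P2": ["A", "E"],
    "P3": ["E", "F"],
    "P4": ["G"],
}

dcn = build_dcn(toy_proteins)
cores = core_numbers(dcn)
inner_core = sorted(d for d, k in cores.items() if k == max(cores.values()))
print(cores)
print(inner_core)
```

In this toy network the four domains that co-occur in `P1` form the innermost core, while `E`, `F` and the isolated domain `G` sit in the periphery. A mutation analysis in the spirit of the abstract would then compare per-domain mutation significance across these core levels.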
