Functional Module Analysis for Gene Coexpression Networks with Network Integration

Networks are a general tool for studying the complex interactions among genes, proteins, and other small molecules. Modular structure is a fundamental property of many biological networks, and many computational methods have been proposed to identify modules in an individual network. In many cases, however, a single network is insufficient for module analysis because of noise in the data or sensitivity to the parameters chosen when the network is built. The growing number of available biological networks makes network integration possible. By integrating networks constructed from different tissues, more informative modules for a specific disease can be derived, and factors that are consistent across different diseases can be inferred. In this paper, we develop a method for identifying modules from multiple networks under different conditions. The problem is formulated as an optimization model that combines module identification in each individual network with alignment of the modules across networks, and we propose an approximation algorithm based on eigenvector computation. In simulation studies, our method outperforms existing methods, especially when the underlying modules differ across the networks. We also applied our method to two groups of human gene coexpression networks: one for three different cancers and one for three tissues from morbidly obese patients. We identified 13 modules with three complete subgraphs and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also show that the main functions of most modules for the corresponding disease have been reported by other researchers, which may provide a theoretical basis for further experimental study of the modules.
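The abstract does not spell out the optimization model, so the following is only a minimal sketch of the general idea of eigenvector-based module detection across several networks, not the paper's algorithm. It assumes the networks share one node set, sums their Newman modularity matrices, and splits the nodes by the sign of the leading eigenvector. The function names (`modularity_matrix`, `shared_bipartition`, `two_cliques`) and the toy networks are hypothetical and chosen for illustration.

```python
import numpy as np

def modularity_matrix(A):
    """Newman modularity matrix B = A - k k^T / (2m) of an undirected graph."""
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)      # node degrees
    two_m = k.sum()        # twice the number of edges
    return A - np.outer(k, k) / two_m

def shared_bipartition(adjacency_list):
    """Split nodes into two groups shared by all networks.

    Assumption for this sketch: the networks are integrated by simply
    summing their modularity matrices; the paper's model may differ.
    """
    B = sum(modularity_matrix(A) for A in adjacency_list)
    eigvals, eigvecs = np.linalg.eigh(B)        # B is symmetric
    lead = eigvecs[:, np.argmax(eigvals)]       # leading eigenvector
    return (lead > 0).astype(int)               # sign gives the group

def two_cliques(bridge, size=5):
    """Toy network: two cliques of `size` nodes joined by one bridge edge."""
    n = 2 * size
    A = np.zeros((n, n))
    A[:size, :size] = 1.0
    A[size:, size:] = 1.0
    np.fill_diagonal(A, 0.0)
    i, j = bridge
    A[i, j] = A[j, i] = 1.0
    return A

# Three toy "conditions": same two modules, different bridge edges.
networks = [two_cliques(b) for b in [(4, 5), (0, 9), (2, 7)]]
found = shared_bipartition(networks)
print(found)
```

In this toy setting the two cliques land in different groups; which group is labelled 0 or 1 depends on the arbitrary sign of the eigenvector. Finding more than two modules, or modules that differ between networks, would need the fuller model the abstract describes.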
