Quantitative assessment of gene expression network module-validation methods

Validation of pluripotent modules in diverse networks holds enormous potential for systems biology and network pharmacology. An emerging challenge is how to assess the accuracy of methods that discover all potential modules in multi-omic networks and validate their architectural characteristics using computational methods that go beyond functional enrichment and biological validation. To outline progress in this domain, we systematically divided the existing Computational Validation Approaches based on Modular Architecture (CVAMA) into topology-based approaches (TBA) and statistics-based approaches (SBA). We compared the available module validation methods on 11 gene expression datasets. Each individual approach yielded partially consistent results in the form of homogeneous models, whereas TBA and SBA contradicted each other. The TBA based on the Zsummary value had a higher Validation Success Ratio (VSR) (51%) and a higher Fluctuation Ratio (FR) (80.92%), whereas the SBA based on the approximately unbiased (AU) p-value had a lower VSR (12.3%) and a lower FR (45.84%). A gray-area simulation study gave a consistent result for these two models and indicated a lower Variation Ratio (VR) (8.10%) for TBA across 6 simulated levels. Despite many new challenges and limited evidence, CVAMA may offer novel insights into modular networks.
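
The abstract does not define the VSR, so the following is only a minimal sketch under one plausible reading: the VSR is taken to be the fraction of modules whose validation score reaches a cutoff. The function name, the cutoffs (Zsummary of at least 2, AU of at least 0.95), and the toy scores are all assumptions made for illustration and are not values from the study.

```python
from typing import Sequence


def validation_success_ratio(scores: Sequence[float], threshold: float) -> float:
    """Return the fraction of modules whose score is >= threshold.

    Assumed definition of the VSR; the source does not spell it out.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    passed = sum(1 for s in scores if s >= threshold)
    return passed / len(scores)


# Toy per-module scores, invented for illustration (not data from the study).
zsummary_scores = [12.4, 1.3, 5.8, 0.7]   # topology-based (TBA) scores
au_scores = [0.97, 0.80, 0.91, 0.99]      # statistics-based (SBA) AU values on a 0-1 scale

# Hypothetical cutoffs; the abstract does not state which ones were used.
Z_THRESHOLD = 2.0
AU_THRESHOLD = 0.95

vsr_tba = validation_success_ratio(zsummary_scores, Z_THRESHOLD)
vsr_sba = validation_success_ratio(au_scores, AU_THRESHOLD)
print(f"VSR (TBA, Zsummary): {vsr_tba:.2%}")
print(f"VSR (SBA, AU):       {vsr_sba:.2%}")
```

Computing both ratios with the same function keeps the comparison between TBA and SBA dependent only on the scores and the chosen cutoffs, which is where the two families of methods actually differ.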
