Understanding network concepts in modules

Background

Network concepts are increasingly used in biology and genetics. For example, the clustering coefficient has been used to understand network architecture; the connectivity (also known as degree) has been used to screen for cancer targets; and the topological overlap matrix has been used to define modules and to annotate genes. Dozens of potentially useful network concepts are known from graph theory.

Results

Here we study network concepts in special types of networks, which we refer to as approximately factorizable networks. In these networks, the pairwise connection strength (adjacency) between two network nodes can be factored into node-specific contributions, named node 'conformity'. The node conformity turns out to be highly related to the connectivity. To provide a formalism for relating network concepts to each other, we define three types of network concepts: fundamental, conformity-based, and approximate conformity-based concepts. Fundamental concepts include the standard definitions of connectivity, density, centralization, heterogeneity, clustering coefficient, and topological overlap. The approximate conformity-based analogs of fundamental network concepts have several theoretical advantages. First, they allow one to derive simple relationships between seemingly disparate network concepts. For example, we derive simple relationships between the clustering coefficient, the heterogeneity, the density, the centralization, and the topological overlap. Second, they allow one to show that fundamental network concepts can be approximated by simple functions of the connectivity in module networks.

Conclusion

Using protein-protein interaction, gene co-expression, and simulated data, we show that a) many networks composed of module nodes are approximately factorizable and b) in these types of networks, simple relationships exist between seemingly disparate network concepts.
Our results are implemented in freely available R software code, which can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/ModuleConformity/ModuleNetworks
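To make the idea of a factorizable network concrete, the following is a minimal Python sketch, not the authors' R implementation. It assumes an exactly factorizable network, in which the adjacency between two distinct nodes i and j is the product of their conformities, a_ij = CF_i * CF_j, and it assumes a zero diagonal so that a node does not count as its own neighbor. The function names and the conformity values are hypothetical and chosen only for illustration. Under these assumptions the connectivity of node i works out to k_i = CF_i * (sum(CF) - CF_i), which illustrates why conformity and connectivity are closely related.

```python
def factorizable_adjacency(conformity):
    """Build a_ij = CF_i * CF_j for i != j (assumption: zero diagonal)."""
    n = len(conformity)
    return [
        [0.0 if i == j else conformity[i] * conformity[j] for j in range(n)]
        for i in range(n)
    ]


def connectivity(adjacency):
    """Connectivity (degree) of each node: the row sum of the adjacency matrix."""
    return [sum(row) for row in adjacency]


def density(adjacency):
    """Mean off-diagonal adjacency: sum of connectivities over n * (n - 1)."""
    n = len(adjacency)
    return sum(connectivity(adjacency)) / (n * (n - 1))


# Hypothetical conformity values in [0, 1], one per node.
conformity = [0.9, 0.8, 0.6, 0.5, 0.3]

adjacency = factorizable_adjacency(conformity)
k = connectivity(adjacency)

# In an exactly factorizable network, connectivity is a simple
# function of conformity: k_i = CF_i * (sum(CF) - CF_i).
total = sum(conformity)
k_from_conformity = [cf * (total - cf) for cf in conformity]

for cf, k_direct, k_formula in zip(conformity, k, k_from_conformity):
    print(f"CF={cf:.1f}  k (row sum)={k_direct:.4f}  k (formula)={k_formula:.4f}")
print(f"density={density(adjacency):.4f}")
```

In this toy example, nodes with higher conformity also have higher connectivity. Real networks are at best approximately factorizable, so the identity above would hold only approximately there.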
