Geometric Interpretation of Gene Coexpression Network Analysis

The merging of network theory and microarray data analysis techniques has spawned a new field: gene coexpression network analysis. While network methods are increasingly used in biology, the network vocabulary of computational biologists tends to be far more limited than that of, say, social network theorists. Here we review and propose several potentially useful network concepts. We take advantage of the relationship between network theory and the field of microarray data analysis to clarify the meaning of, and the relationships among, network concepts in gene coexpression networks. Network theory offers a wealth of intuitive concepts for describing the pairwise relationships among genes, which are depicted in cluster trees and heat maps. Conversely, microarray data analysis techniques (singular value decomposition, tests of differential expression) can also be used to address difficult problems in network theory. We describe conditions under which a close relationship exists between network analysis and microarray data analysis techniques, and provide a rough dictionary for translating between the two fields. Using the angular interpretation of correlations, we provide a geometric interpretation of network-theoretic concepts and derive unexpected relationships among them. We use the singular value decomposition of module expression data to characterize approximately factorizable gene coexpression networks, i.e., adjacency matrices that factor into node-specific contributions. High-level and low-level views of coexpression networks allow us to study the relationships among modules and among module genes, respectively. We characterize coexpression networks where hub genes are significant with respect to a microarray sample trait and show that the network concept of intramodular connectivity can be interpreted as a fuzzy measure of module membership. We illustrate our results using human, mouse, and yeast microarray gene expression data.
The unification of coexpression network methods with traditional data mining methods can inform the application and development of systems biology methods.
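
The link between the singular value decomposition and approximately factorizable networks can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: the adjacency is taken to be the absolute Pearson correlation raised to a power `beta` (set to 1 here), the module eigengene is taken to be the first right singular vector of the row-standardized expression matrix, and all function names and the simulated data are hypothetical. Under these assumptions, each gene's absolute correlation with the eigengene acts as its node-specific contribution, the product of two such contributions approximates the pairwise adjacency, and intramodular connectivity tracks this fuzzy module membership.

```python
import numpy as np


def simulate_module(n_genes=30, n_samples=200, seed=0):
    """Simulate one module: every gene is a noisy copy of a shared seed profile.

    Illustrative data only; the loadings control how strongly each gene
    follows the seed profile.
    """
    rng = np.random.default_rng(seed)
    seed_profile = rng.standard_normal(n_samples)
    loadings = rng.uniform(0.4, 0.95, n_genes)
    noise = rng.standard_normal((n_genes, n_samples))
    X = loadings[:, None] * seed_profile + np.sqrt(1 - loadings**2)[:, None] * noise
    return X, loadings


def module_eigengene(X):
    """First right singular vector of the row-standardized genes x samples matrix.

    The sign of a singular vector is arbitrary.
    """
    Z = X - X.mean(axis=1, keepdims=True)
    Z = Z / Z.std(axis=1, keepdims=True)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Vt[0]


def eigengene_conformity(X, eigengene, beta=1):
    """|cor(gene, eigengene)|**beta, read here as a fuzzy module membership."""
    r = np.array([np.corrcoef(x, eigengene)[0, 1] for x in X])
    return np.abs(r) ** beta


def coexpression_adjacency(X, beta=1):
    """Assumed adjacency: |cor(gene_i, gene_j)|**beta with a zero diagonal."""
    A = np.abs(np.corrcoef(X)) ** beta
    np.fill_diagonal(A, 0.0)
    return A


X, _ = simulate_module()
E = module_eigengene(X)
kme = eigengene_conformity(X, E)
A = coexpression_adjacency(X)

# Factorizability check: off-diagonal a_ij versus the product kme_i * kme_j.
upper = np.triu_indices(A.shape[0], k=1)
fit = np.corrcoef(A[upper], np.outer(kme, kme)[upper])[0, 1]

# Intramodular connectivity (row sums of A) versus fuzzy module membership.
connectivity = A.sum(axis=1)
agreement = np.corrcoef(connectivity, kme)[0, 1]
print(f"factorization fit: {fit:.2f}, connectivity vs. membership: {agreement:.2f}")
```

Because a Pearson correlation is the cosine of the angle between two centered expression profiles, the same quantities can be read geometrically: genes whose profiles form a small angle with the eigengene have high membership and, in a factorizable module, high connectivity.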
