Construction, Visualisation, and Clustering of Transcription Networks from Microarray Expression Data

Network analysis goes beyond conventional pairwise approaches to data analysis because the context of each component within a network graph can be taken into account. Such approaches are increasingly being applied to genomics data, where functional linkages are used to connect genes or proteins. However, while microarray gene expression datasets are now abundant and of high quality, few approaches have been developed for analysing such data in a network context. We present a novel approach for 3-D visualisation and analysis of transcriptional networks generated from microarray data. These networks consist of nodes representing transcripts, connected to one another by virtue of the similarity of their expression profiles across multiple conditions. Analysing genome-wide gene transcription across 61 mouse tissues, we describe the unusual topography of the large and highly structured networks produced, and demonstrate how they can be used to visualise, cluster, and mine large datasets. This approach is fast, intuitive, and versatile, and allows the identification of biological relationships that may be missed by conventional analysis techniques. This work has been implemented in a freely available open-source application named BioLayout Express 3D.
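As a rough illustration of how a network of this kind might be assembled, the sketch below links transcripts whose expression profiles are similar across conditions and then groups them into connected components. It is a minimal sketch under stated assumptions, not the procedure used in BioLayout Express 3D: the abstract does not name a similarity measure, so Pearson correlation, the 0.9 cut-off, the toy expression matrix, and the function names `coexpression_edges` and `connected_components` are all illustrative choices.

```python
import numpy as np


def coexpression_edges(expr, threshold=0.9):
    """Return (i, j, r) for each transcript pair with Pearson r >= threshold.

    expr: 2-D array, rows = transcripts, columns = conditions (assumed layout).
    threshold: illustrative cut-off, not a value taken from the paper.
    """
    corr = np.corrcoef(expr)                        # transcript-by-transcript matrix
    rows, cols = np.triu_indices_from(corr, k=1)    # each unordered pair once
    keep = corr[rows, cols] >= threshold
    return [(int(i), int(j), float(corr[i, j]))
            for i, j in zip(rows[keep], cols[keep])]


def connected_components(n_nodes, edges):
    """Group node indices into connected components using union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]           # path halving
            x = parent[x]
        return x

    for i, j, _ in edges:
        parent[find(i)] = find(j)

    groups = {}
    for node in range(n_nodes):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


# Toy data (hypothetical): 4 transcripts measured across 5 conditions.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],    # transcript 0
    [2.0, 4.0, 6.0, 8.0, 10.5],   # transcript 1, closely tracks transcript 0
    [5.0, 4.0, 3.0, 2.0, 1.0],    # transcript 2, anti-correlated with 0
    [1.0, 5.0, 2.0, 4.0, 3.0],    # transcript 3, weakly related to the others
])

edges = coexpression_edges(expr, threshold=0.9)
components = connected_components(expr.shape[0], edges)
```

With this toy matrix only transcripts 0 and 1 are joined by an edge, so the other two remain isolated nodes. Raising or lowering the threshold trades network size against edge stringency, which is the kind of choice a real analysis would need to justify against the data.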
