TranscriptomeBrowser: A Powerful and Flexible Toolbox to Explore Productively the Transcriptional Landscape of the Gene Expression Omnibus Database

Background: As public microarray repositories are constantly growing, we are facing the challenge of designing strategies to provide productive access to the available data.

Methodology: We used a modified version of the Markov clustering algorithm to systematically extract clusters of co-regulated genes from hundreds of microarray datasets stored in the Gene Expression Omnibus database (n = 1,484). This approach led to the definition of 18,250 transcriptional signatures (TS) that were tested for functional enrichment using the DAVID knowledgebase. Over-representation of functional terms was found in a large proportion of these TS (84%). We developed a Java application, TBrowser, which comes with an open plug-in architecture and whose interface implements a search engine supporting several Boolean operators (http://tagc.univ-mrs.fr/tbrowser/). Users can search and analyze TS that contain a list of identifiers (gene symbols or AffyIDs) or that are associated with a set of functional terms.

Conclusions/Significance: As proof of principle, TBrowser was used to define breast cancer cell-specific genes and to detect chromosomal abnormalities in tumors. Finally, taking advantage of our large collection of transcriptional signatures, we constructed a comprehensive map that summarizes the gene-gene co-regulations observed across all experiments performed on the Affymetrix HGU133A platform. We provide evidence that this map can extend our knowledge of cellular signaling pathways.
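To make the clustering step more concrete, the following is a minimal sketch of the standard Markov clustering (MCL) procedure: alternating expansion (matrix squaring) and inflation (element-wise powering followed by column normalization) on a gene-gene graph. It is not the modified version used for TBrowser, whose details are not given here. The function names, the inflation value of 2.0, the thresholds, and the toy graph are all assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two steps of the random walk)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: raise entries to `power`, then renormalize columns."""
    return normalize_columns([[v ** power for v in row] for row in m])


def markov_cluster(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Standard MCL on a symmetric adjacency matrix (list of lists).

    Returns clusters as sorted lists of node indices. Illustrative only:
    parameter defaults are assumptions, not values from the paper.
    """
    n = len(adjacency)
    # Self-loops are commonly added before running MCL.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that keeps non-zero mass defines a cluster: its non-zero columns.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy co-expression graph (hypothetical): two disconnected triangles of genes.
toy_graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_clusters = markov_cluster(toy_graph)
print(toy_clusters)  # [[0, 1, 2], [3, 4, 5]]
```

In this toy case each triangle is recovered as one cluster. On real expression data the graph would first have to be built from a gene-gene similarity measure, and larger inflation values would typically yield smaller, tighter clusters.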
