DSGeo: Software tools for cross-platform analysis of gene expression data in GEO

The Gene Expression Omnibus (GEO) is the largest public resource for gene expression data. While GEO supports data browsing, querying, and retrieval, additional tools can help realize its potential for aggregating and comparing data across multiple studies and platforms. This paper describes DSGeo, a collection of tools developed for annotating, aggregating, integrating, and analyzing data deposited in GEO. The core set of tools includes a Relational Database, a Data Loader, a Data Browser, and an Expression Combiner and Analyzer. The application lets users query for specific sample characteristics and identify studies containing samples that match the query. The Expression Combiner normalizes and aggregates data from these samples and returns the data to the user after filtering according to the user's preferences. The Expression Analyzer supports simple statistical comparisons between groups of data. Together, these tools make annotated cross-platform data directly available for analysis.
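As a rough illustration of the combine-then-compare workflow described above, the following Python sketch normalizes each sample, aggregates samples from different platforms on their shared genes, and computes a simple two-group statistic. It is not DSGeo's implementation: the abstract does not specify the normalization method or the statistical test, so per-sample z-scoring and Welch's t statistic are stand-ins chosen for brevity, and the function names (`zscore_sample`, `combine`, `welch_t`), sample identifiers, and expression values are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def zscore_sample(expr):
    """Standardize one sample's gene -> value map to mean 0, SD 1.

    Assumption: z-scoring stands in for whatever normalization DSGeo applies.
    """
    values = list(expr.values())
    mu, sd = mean(values), stdev(values)
    return {gene: (value - mu) / sd for gene, value in expr.items()}


def combine(samples):
    """Normalize each sample, then keep only genes measured in every sample.

    `samples` maps a sample id to a gene -> expression dict. Samples may come
    from different platforms, so their gene sets can differ.
    """
    normalized = {sid: zscore_sample(expr) for sid, expr in samples.items()}
    shared = set.intersection(*(set(expr) for expr in normalized.values()))
    return {
        gene: {sid: normalized[sid][gene] for sid in normalized}
        for gene in sorted(shared)
    }


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of values (unequal variances)."""
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    return (mean(group_a) - mean(group_b)) / sqrt(
        var_a / len(group_a) + var_b / len(group_b)
    )


# Hypothetical samples from two platforms with partly overlapping gene sets.
samples = {
    "sample_platform_1": {"TP53": 1.0, "BRCA1": 2.0, "EGFR": 3.0},
    "sample_platform_2": {"TP53": 10.0, "BRCA1": 20.0, "MYC": 30.0},
}
combined = combine(samples)  # only TP53 and BRCA1 are shared

# Hypothetical normalized values for one gene in two sample groups.
t_stat = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Restricting the combined table to genes present in every sample is the simplest way to make values comparable across platforms; a real pipeline would also need to map platform-specific probes to a common gene identifier before this step.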
