Gene Expression Omnibus (GEO): Microarray data storage, submission, retrieval, and analysis

The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information archives and freely distributes high-throughput molecular abundance data, predominantly gene expression data generated by DNA microarray technology. The database has a flexible design that can handle diverse styles of both unprocessed and processed data, within an infrastructure that supports the Minimum Information About a Microarray Experiment (MIAME) guidelines and promotes fully annotated submissions. GEO currently stores about a billion individual gene expression measurements, derived from over 100 organisms and submitted by over 1500 laboratories, addressing a wide range of biological phenomena. To maximize the utility of these data, several user-friendly web-based interfaces and applications have been implemented that enable effective exploration, query, and visualization at the level of individual genes or entire studies. This chapter describes how data are stored, the submission procedures, and the mechanisms for data retrieval and query. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.
