BART: bioinformatics array research tool

Background: Microarray experiments comprise more than half of all series in the Gene Expression Omnibus (GEO). However, downloading and analyzing raw or semi-processed microarray data from GEO is not intuitive: it requires manual, error-prone analysis and a bioinformatics background. This is due to a lack of standardization in array platform fabrication as well as the lack of a simple interactive tool for clustering, plotting, differential expression testing, and testing for functional enrichment.

Results: We introduce the Bioinformatics Array Research Tool (BART), an R Shiny web application that automates the microarray download and analysis process across diverse microarray platforms. It provides an intuitive interface, automatically downloads and parses data from GEO, suggests groupings of samples for differential expression testing, performs batch effect correction, outputs quality control plots, converts probe IDs, generates full lists of differentially expressed genes, and performs functional enrichment analysis. By comparing BART to four leading online microarray analysis tools, we show that it enables a more comprehensive analysis of a wider range of microarray datasets on GEO.

Conclusions: BART allows a scientist with no bioinformatics background to extract knowledge from their own microarray data or from microarray experiments available on GEO. BART works on more microarray experiments and provides more comprehensive analyses than existing microarray analysis tools. BART is hosted at bart.salk.edu, includes a user tutorial, and is available for download from https://bitbucket.org/Luisa_amaral/bart.
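To make the sample-grouping step concrete, the sketch below shows one way groupings could be suggested from GEO sample titles: titles that differ only by a trailing replicate label are placed in the same group. This is an illustrative assumption, not BART's documented method; the function name `suggest_groups`, the regular expression, and the example titles are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical heuristic (not taken from BART): a trailing replicate marker
# such as "_rep2", " replicate 3", or "-1" is stripped from each title, and
# titles that share the remaining text are suggested as one group.
REPLICATE_SUFFIX = re.compile(
    r"[\s_\-]*(?:rep(?:licate)?)?[\s_\-]*\d+$", flags=re.IGNORECASE
)


def suggest_groups(sample_titles):
    """Map a normalized group name to the sample titles assigned to it."""
    groups = defaultdict(list)
    for title in sample_titles:
        # Remove the replicate suffix and normalize case to form the group key.
        key = REPLICATE_SUFFIX.sub("", title.strip()).lower()
        groups[key].append(title)
    return dict(groups)


# Made-up sample titles, for illustration only.
titles = ["Control_rep1", "Control_rep2", "Treated_rep1", "Treated_rep2"]
groups = suggest_groups(titles)
for name, members in groups.items():
    print(name, members)
```

A heuristic like this can only propose groupings. Titles that do not follow a replicate-suffix convention would still need to be reviewed and regrouped by the user before differential expression testing.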
