The Galaxy Track Browser: Transforming the genome browser from visualization tool to analysis tool

The proliferation of next-generation sequencing (NGS) technologies and analysis tools presents new challenges to genome browsers. These challenges include supporting very large datasets, integrating analysis tools with data visualization to help users reason about and improve analyses, and sharing or publishing fully interactive visualizations. The Galaxy Track Browser (GTB) is a Web-based genome browser integrated into the Galaxy platform that addresses these challenges. GTB is the first Web-based genome browser to provide a full multi-resolution data model; this model supports efficient data retrieval from very large datasets. GTB leverages the Galaxy platform to combine data visualization and data analysis; users can specify parameter values and run tools to produce new data, all within GTB. GTB also provides interactive filters that dynamically show and hide data and can be used to identify data for further investigation. GTB is available on every Galaxy server, and visualizations can be created for both standard and custom genome builds. Fully interactive GTB visualizations can be shared with colleagues and published on the Web using a simple graphical user interface.
