Savant: genome browser for high-throughput sequencing data

MOTIVATION: The advent of high-throughput sequencing (HTS) technologies has made it affordable to sequence many individuals' genomes. At the same time, the computational analysis of the large volumes of data generated by the new sequencing machines remains a challenge. While a plethora of tools is available to map the resulting reads to a reference genome and to conduct primary analysis of the mappings, it is often necessary to visually examine the results and underlying data to confirm predictions and understand the functional effects, especially in the context of other datasets.

RESULTS: We introduce Savant, the Sequence Annotation, Visualization and ANalysis Tool, a desktop visualization and analysis browser for genomic data. Savant was developed for visualizing and analyzing HTS data, with special care taken to enable dynamic visualization in the presence of gigabases of genomic reads and references the size of the human genome. Savant supports the visualization of genome-based sequence, point, interval and continuous datasets, and offers multiple visualization modes that enable easy identification of genomic variants (including single nucleotide polymorphisms, structural and copy number variants) and functional genomic information (e.g. peaks in ChIP-seq data) in the context of genomic annotations.

AVAILABILITY: Savant is freely available at http://compbio.cs.toronto.edu/savant.
