VarB: a variation browsing and analysis tool for variants derived from next-generation sequencing data

Summary: There is an immediate need for tools that can both analyse and visualize, in real time, single-nucleotide polymorphisms, insertions and deletions, and other structural variants stored in new sequence file formats. We have developed VarB, software that visualizes variant call format (VCF) files in real time, identifies regions under balancing selection and finds informative markers that differentiate user-defined groups (e.g. populations). We demonstrate its utility using sequence data from 50 Plasmodium falciparum isolates from two continents, and confirm known signals in genomic regions that contain important antigenic and anti-malarial drug-resistance genes.

Availability and implementation: The C++-based software VarB and its user manual are available from www.pathogenseq.org/varb.

Contact: taane.clark@lshtm.ac.uk
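As an illustration of how a region under balancing selection might be flagged from variant calls, the sketch below computes Tajima's D for a single genomic window. It is not VarB's implementation. The function name `tajimas_d` and its inputs are hypothetical, and it assumes haploid samples, biallelic sites and no missing genotypes.

```python
import math


def tajimas_d(alt_counts, n):
    """Tajima's D for one window (illustrative sketch, not VarB code).

    alt_counts: alternate-allele count at each site in the window.
    n: number of haploid samples, assumed identical at every site.
    Returns None when the statistic is undefined.
    """
    if n < 4:
        return None  # the variance term is zero for n < 4
    # Keep only segregating sites (both alleles present in the sample).
    seg = [k for k in alt_counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        return None
    # Mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)
    # Constants that depend only on the sample size.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    var = e1 * s + e2 * s * (s - 1)
    if var <= 0:
        return None
    # Difference between the two diversity estimators, scaled by its s.d.
    return (pi - s / a1) / math.sqrt(var)
```

Under these assumptions, a window dominated by intermediate-frequency alleles gives a positive D, the pattern expected under balancing selection, whereas a window dominated by rare alleles gives a negative D.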
