SNPdat: Easy and rapid annotation of results from de novo SNP discovery projects for model and non-model organisms

Background: Single nucleotide polymorphisms (SNPs) are the most abundant genetic variant found in vertebrates and invertebrates. SNP discovery has become a highly automated, robust and relatively inexpensive process allowing the identification of many thousands of mutations for model and non-model organisms. Annotating large numbers of SNPs can be a difficult and complex process. Many tools available are optimised for use with organisms densely sampled for SNPs, such as humans. There are currently few tools available that are species non-specific or support non-model organism data.

Results: Here we present SNPdat, a high-throughput analysis tool that can provide a comprehensive annotation of both novel and known SNPs for any organism with a draft sequence and annotation. Using a dataset of 4,566 SNPs identified in cattle using high-throughput DNA sequencing, we demonstrate the annotations performed and the statistics that can be generated by SNPdat.

Conclusions: SNPdat provides users with a simple tool for annotation of genomes that are either not supported by other tools or have a small number of annotated SNPs available. SNPdat can also be used to analyse datasets from organisms which are densely sampled for SNPs. As a command line tool it can easily be incorporated into existing SNP discovery pipelines and fills a niche for analyses involving non-model organisms that are not supported by many available SNP annotation tools. SNPdat will be of great interest to scientists involved in SNP discovery and analysis projects, particularly those with limited bioinformatics experience.
