HaploSNPer: a web-based allele and SNP detection tool

Background

Single nucleotide polymorphisms (SNPs) and small insertions or deletions (indels) are the most common types of polymorphism and are frequently used for molecular marker development. Such markers have become very popular for all kinds of genetic analysis, including haplotype reconstruction. Haplotypes can be reconstructed for whole chromosomes but also for specific genes, based on the SNPs present. In the latter context, haplotypes represent the different alleles of a gene. The computational approach to SNP mining is becoming increasingly popular because the continuously growing number of sequences deposited in databases allows SNPs to be identified more accurately. Several software packages have been developed for SNP mining from databases. Of these, QualitySNP is the only tool that combines SNP detection with the reconstruction of alleles, which results in fewer false-positive SNPs; it also runs much faster than other programs. We have built a web-based SNP discovery and allele detection tool (HaploSNPer) based on QualitySNP.

Results

HaploSNPer is a flexible web-based tool for detecting SNPs and alleles in user-specified input sequences from both diploid and polyploid species. It includes BLAST for finding homologous sequences in public EST databases, CAP3 or PHRAP for aligning them, and QualitySNP for discovering reliable allelic sequences and SNPs. All possible and reliable alleles are detected by a mathematical algorithm using potential SNP information. Reliable SNPs are then identified based on the reconstructed alleles and on sequence redundancy.

Conclusion

Thorough testing of HaploSNPer (and the underlying QualitySNP algorithm) has shown that EST information alone is sufficient for the identification of alleles and that reliable SNPs can be found efficiently. Furthermore, HaploSNPer supplies a user-friendly interface for visualization of SNPs and alleles.
HaploSNPer is available from http://www.bioinformatics.nl/tools/haplosnper/.
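
To make the idea of redundancy-based SNP detection and allele grouping more concrete, the following is a minimal Python sketch. It is not the QualitySNP algorithm and not HaploSNPer's code: the function names (`candidate_snps`, `group_by_haplotype`), the `min_allele_count` threshold, the toy reads, and the choice to treat `-` as missing data rather than as an indel are all assumptions made for illustration. It assumes the reads of one cluster have already been aligned to equal length, as an assembler such as CAP3 or PHRAP would provide.

```python
from collections import Counter, defaultdict

GAP = "-"  # assumption: padding/unknown symbol, ignored when counting bases


def candidate_snps(aligned_reads, min_allele_count=2):
    """Return {column: {base: count}} for columns in which at least two
    different bases are each supported by min_allele_count or more reads."""
    snps = {}
    for col in range(len(aligned_reads[0])):
        counts = Counter(r[col] for r in aligned_reads if r[col] != GAP)
        supported = {b: n for b, n in counts.items() if n >= min_allele_count}
        if len(supported) >= 2:
            snps[col] = supported
    return snps


def group_by_haplotype(aligned_reads, snp_columns):
    """Group read indices by the bases they carry at the candidate SNP columns."""
    columns = sorted(snp_columns)
    groups = defaultdict(list)
    for idx, read in enumerate(aligned_reads):
        key = "".join(read[c] for c in columns)
        groups[key].append(idx)
    return dict(groups)


# Toy cluster of six aligned reads (hypothetical data).
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACCTACGA",
    "ACCTACGA",
    "ACGTACGT",
    "ACGTTCGT",  # the T at column 4 is seen once only, so it is not reported
]

snps = candidate_snps(reads)
haplotypes = group_by_haplotype(reads, snps.keys())
print(snps)        # columns 2 and 7 are reported
print(haplotypes)  # reads fall into two groups, "GT" and "CA"
```

In this toy example, the single-read difference at column 4 is discarded because it lacks redundancy, while the two well-supported columns split the reads into two candidate alleles. A real pipeline would additionally have to handle indels, base quality, and partially overlapping reads.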
