"PolyMin": software for identification of the minimum number of polymorphisms required for haplotype and genotype differentiation

Background: Analysis of allelic variation for relevant genes and monitoring of chromosome segment transmission during selection are important approaches in plant breeding and ecology. Minimizing the number of molecular markers required for this purpose is crucial because of cost and time constraints. To date, software for identifying the minimum number of required markers has been optimized for human genetics and only partly matches the needs of plant scientists and breeders. In addition, different software packages with insufficient interoperability need to be combined to extract this information from available allele sequence data, resulting in an error-prone multi-step process of data handling.

Results: PolyMin, a computer program that combines the detection of a minimum set of single nucleotide polymorphisms (SNPs) and/or insertions/deletions (INDELs) necessary for allele differentiation with the subsequent genotype differentiation in plant populations, has been developed. Its efficiency in finding minimum sets of polymorphisms is comparable to that of other available program packages.

Conclusion: A computer program that detects the minimum number of SNPs for haplotype discrimination and subsequent genotype differentiation has been developed, and its performance has been compared with that of other relevant software. The main advantage of PolyMin, especially for plant scientists, is the integration of procedures from sequence analysis to polymorphism selection within a single program, including both haplotype and genotype differentiation.
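To make the underlying selection problem concrete, the following Python sketch searches for a smallest set of polymorphic alignment columns that gives every haplotype a unique pattern, and then shows how unphased diploid genotypes would appear at those columns. It is an illustration only and not PolyMin's algorithm: the exhaustive search, the function names, and the toy sequences are all assumptions introduced here, and the search is practical only for small numbers of polymorphic sites.

```python
from itertools import combinations


def polymorphic_sites(haplotypes):
    """Return indices of alignment columns where at least two haplotypes differ.

    Gaps ('-') are treated as an ordinary state, so INDEL columns count too.
    """
    length = len(haplotypes[0])
    return [i for i in range(length) if len({h[i] for h in haplotypes}) > 1]


def minimum_discriminating_sites(haplotypes):
    """Find a smallest set of columns that distinguishes all haplotypes.

    Hypothetical helper: tries subsets of polymorphic columns in order of
    increasing size. Returns None if two input haplotypes are identical.
    """
    if len(set(haplotypes)) != len(haplotypes):
        return None
    sites = polymorphic_sites(haplotypes)
    for k in range(len(sites) + 1):
        for subset in combinations(sites, k):
            patterns = {tuple(h[i] for i in subset) for h in haplotypes}
            if len(patterns) == len(haplotypes):
                return list(subset)
    return None


def genotype_signature(hap_a, hap_b, sites):
    """Unphased marker readout of a diploid genotype at the chosen sites.

    At each site only the unordered pair of alleles is observed.
    """
    return tuple(tuple(sorted((hap_a[i], hap_b[i]))) for i in sites)


# Toy aligned alleles (assumed data, equal length, '-' marks a deletion).
haplotypes = ["ACGTA", "ACGTC", "ATGTA", "ATG-C"]
sites = minimum_discriminating_sites(haplotypes)
print(sites)

# Two different genotypes can share one unphased readout at those sites.
g1 = genotype_signature(haplotypes[0], haplotypes[3], sites)
g2 = genotype_signature(haplotypes[1], haplotypes[2], sites)
print(g1 == g2)
```

In this toy case two columns are enough to separate the four haplotypes, yet two distinct heterozygous genotypes give the same unphased readout at those columns. This is one reason a marker set chosen for haplotype differentiation may need to be checked again, or extended, for genotype differentiation.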
