Using quality measures to facilitate allele calling in high-throughput genotyping.

The main limitation in high-throughput microsatellite genotyping is currently the manual editing of allele calls. Although programs for automated allele calling have been available for several years, their usefulness has been limited because accurate data could be assured only by manually inspecting the electropherograms to confirm each call. Here we describe a parametric approach to allele call quality control that eliminates much of the time required for manual editing. The approach is implemented in an editing tool, Decode-GT, that works downstream of the allele calling program TrueAllele (TA). Decode-GT reads the output data from TA, displays the underlying electropherograms for the genotypes, and sorts the allele calls into three categories: good, bad, and ambiguous. It accepts the good calls, discards the bad calls, and suggests that the user inspect the ambiguous calls, thereby reducing dependence on manual editing. The categorization uses three parameters: (1) the quality value that TA assigns to each allele call; (2) the peak height of the alleles; and (3) the size of the peak shift needed to move peaks into the nearest bin. We report how we optimized these parameters so that the ambiguous category was as small as possible, while both the number of miscalled genotypes in the good category and the number of usable genotypes in the bad category remained negligible. This approach reduces the manual editing time and results in <1% miscalls.
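The three-way categorization described above can be illustrated with a minimal sketch. It is not the Decode-GT implementation: the names `AlleleCall`, `Thresholds`, and `categorize`, the 0-to-1 quality scale, the units, and every threshold value are assumptions chosen for illustration only. The abstract does not state the optimized cutoffs or how the three parameters are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleCall:
    """One allele call; field names and scales are hypothetical."""
    quality: float      # caller's quality value (assumed here to lie in 0..1)
    peak_height: float  # peak height in arbitrary fluorescence units
    bin_shift: float    # shift (in base pairs) needed to reach the nearest bin


@dataclass(frozen=True)
class Thresholds:
    """Illustrative cutoffs only; the real values would be tuned on edited data."""
    good_quality: float = 0.9
    bad_quality: float = 0.5
    good_height: float = 200.0
    bad_height: float = 50.0
    good_shift: float = 0.3
    bad_shift: float = 0.8


def categorize(call: AlleleCall, t: Thresholds = Thresholds()) -> str:
    """Return 'good', 'bad', or 'ambiguous' for a single allele call."""
    shift = abs(call.bin_shift)
    # Any one clearly failing parameter is enough to discard the call.
    if (call.quality < t.bad_quality
            or call.peak_height < t.bad_height
            or shift > t.bad_shift):
        return "bad"
    # A call is accepted only when all three parameters look clean.
    if (call.quality >= t.good_quality
            and call.peak_height >= t.good_height
            and shift <= t.good_shift):
        return "good"
    # Everything in between is flagged for manual inspection.
    return "ambiguous"


if __name__ == "__main__":
    for c in (AlleleCall(0.95, 500.0, 0.1),
              AlleleCall(0.30, 500.0, 0.1),
              AlleleCall(0.80, 500.0, 0.1)):
        print(c, categorize(c))
```

In a scheme like this, moving each good and bad cutoff toward the other shrinks the ambiguous category, at the cost of more miscalls among accepted calls or more usable genotypes among discarded ones. That is the trade-off the parameter optimization has to balance.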
