Identification and analysis of error types in high-throughput genotyping.

Although errors in genotyping data can clearly lead to severe errors in linkage analysis, there is as yet no consensus strategy for identifying them. Existing strategies include comparison of duplicate samples, independent calling of alleles, and Mendelian-inheritance-error checking. This study aimed to develop a better understanding of the error types associated with microsatellite genotyping, as a first step toward a rational error-detection strategy. Two microsatellite marker sets (a commercial genomewide set and a custom-designed fine-resolution mapping set) were used to generate 118,420 and 22,500 initial genotypes and 10,088 and 8,328 duplicates, respectively. Mendelian-inheritance errors were identified with PedManager software, and concordance was determined for the duplicate samples. Concordance checking identifies only human errors, whereas Mendelian-inheritance-error checking can also detect additional errors, such as mutations and null alleles. Neither strategy is able to detect all errors. Mendelian-inheritance-error checking of the commercial marker data identified 0.13% human errors and 0.12% other errors (0.25% total error), whereas concordance checking found 0.16% human errors. Similarly, Mendelian-inheritance-error checking of the custom-set data identified 1.37% errors, compared with 2.38% human errors identified by concordance checking. A greater variety of error types was detected by Mendelian-inheritance-error checking than by duplication of samples or by independent reanalysis of gels. These data suggest that Mendelian-inheritance-error checking is a worthwhile strategy for both types of genotyping data, whereas fine-mapping studies benefit more from concordance checking than do studies using commercial marker data. Maximizing error identification increases the likelihood of detecting linkage when complex diseases are analyzed.
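
To make the difference between the two checks concrete, the following is a minimal Python sketch under stated assumptions. It is not the PedManager algorithm; the function names (`is_mendelian_consistent`, `discordance_rate`) and the allele sizes are hypothetical and chosen only for illustration. Genotypes are modeled as pairs of allele sizes in base pairs for a single microsatellite marker.

```python
from itertools import product


def is_mendelian_consistent(child, father, mother):
    """Return True if the child's genotype can be formed by taking
    one allele from the father and one from the mother."""
    target = sorted(child)
    return any(sorted((f, m)) == target for f, m in product(father, mother))


def discordance_rate(pairs):
    """Fraction of (original, duplicate) genotype pairs that disagree,
    ignoring the order of the two alleles."""
    mismatches = sum(1 for first, second in pairs if sorted(first) != sorted(second))
    return mismatches / len(pairs)


# Hypothetical trio at one marker (allele sizes are illustrative only).
father = (120, 124)
mother = (122, 126)

# A child carrying one allele from each parent passes the check.
consistent_child = is_mendelian_consistent((120, 122), father, mother)

# An allele seen in neither parent (a miscall or a mutation) is flagged.
inconsistent_child = is_mendelian_consistent((120, 128), father, mother)

# Null-allele scenario: the father is called 120/120 but actually carries an
# unamplified allele; the child inherits it and is called 122/122.
# The inheritance check flags the trio even though no calling mistake was made.
null_allele_case = is_mendelian_consistent((122, 122), (120, 120), mother)

# Concordance check on two hypothetical duplicate pairs: the first agrees
# (allele order differs), the second disagrees.
duplicates = [((120, 122), (122, 120)), ((120, 124), (120, 126))]
rate = discordance_rate(duplicates)
```

In this sketch the null-allele trio is flagged by the inheritance check, yet a duplicate run of the same samples would reproduce the same calls and so pass the concordance check. This mirrors the statement above that concordance checking identifies only human errors.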
