Evaluation of two methods for computational HLA haplotypes inference using a real dataset

Background

HLA haplotype analysis has been used in population genetics and in the investigation of disease-susceptibility loci, owing to the high polymorphism of the HLA region. Several methods for inferring haplotypes from genotypic data have been proposed, but it is unclear how accurate each method is or which method is superior. The accuracy of two of the leading methods of computational haplotype inference, one based on the Expectation-Maximization algorithm (implemented in Arlequin v3.0) and one based on a Bayesian algorithm (implemented in PHASE v2.1.1), was compared using a set of 122 HLA haplotypes (A-B-Cw-DQB1-DRB1) determined through direct counting. The accuracy was measured with the Mean Squared Error (MSE), Similarity Index (IF) and Haplotype Identification Index (IH).

Results

Neither method inferred all of the known haplotypes, and some differences were observed in the accuracy of the two methods in terms of both haplotype determination and haplotype frequency estimation. Working with haplotypes composed of low-polymorphism sites and present in more than one individual increased the confidence in the assignment of haplotypes and in the estimation of the haplotype frequencies generated by both programs.

Conclusion

The method implemented in PHASE v2.1.1 had the best overall performance in both haplotype construction and frequency calculation, although the differences between the two methods were insubstantial. To our knowledge, this was the first work to test statistical methods using real haplotypic data from the HLA region.
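The abstract names three accuracy measures without giving their formulas. The following is a minimal sketch of how such measures are often computed when comparing estimated and directly counted haplotype frequencies. The definitions used here are assumptions, not taken from this text: MSE is averaged over the union of true and estimated haplotypes, IF is taken as one minus half the summed absolute frequency differences, and IH is taken as a Dice-style overlap between the two haplotype sets. The function names and the toy haplotype labels are hypothetical.

```python
def _union(true_freqs, est_freqs):
    """All haplotypes seen in either the true or the estimated set."""
    return set(true_freqs) | set(est_freqs)


def mean_squared_error(true_freqs, est_freqs):
    """Assumed definition: mean of squared frequency differences over the union."""
    haps = _union(true_freqs, est_freqs)
    return sum(
        (est_freqs.get(h, 0.0) - true_freqs.get(h, 0.0)) ** 2 for h in haps
    ) / len(haps)


def similarity_index(true_freqs, est_freqs):
    """Assumed definition: 1 - 0.5 * sum of absolute frequency differences."""
    haps = _union(true_freqs, est_freqs)
    return 1.0 - 0.5 * sum(
        abs(est_freqs.get(h, 0.0) - true_freqs.get(h, 0.0)) for h in haps
    )


def identification_index(true_freqs, est_freqs):
    """Assumed definition: 2 * (shared haplotypes) / (true count + estimated count)."""
    true_set = {h for h, f in true_freqs.items() if f > 0}
    est_set = {h for h, f in est_freqs.items() if f > 0}
    return 2.0 * len(true_set & est_set) / (len(true_set) + len(est_set))


# Toy example with hypothetical haplotype labels (not data from the study).
true_freqs = {"h1": 0.5, "h2": 0.5}
est_freqs = {"h1": 0.5, "h3": 0.5}

mse = mean_squared_error(true_freqs, est_freqs)
i_f = similarity_index(true_freqs, est_freqs)
i_h = identification_index(true_freqs, est_freqs)
```

Under these assumed definitions, a perfect reconstruction gives an MSE of 0 and both indices equal to 1, so lower MSE and higher IF and IH indicate better agreement with the directly counted haplotypes.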
