Stepwise Threshold Clustering: A New Method for Genotyping MHC Loci Using Next-Generation Sequencing Technology

Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease and their extremely high levels of genetic diversity. Next-generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: 1) a “gray zone” where low-frequency alleles and high-frequency artifacts can be difficult to disentangle and 2) a similar-sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci, Stepwise Threshold Clustering (STC), that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data, which attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency- and similarity-based criteria to clusters rather than to individual sequences, STC is able to identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, which increases the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.
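
The general idea of clustering reads at increasing similarity levels and filtering whole clusters by frequency can be illustrated with a small sketch. This is not the STC implementation: the abstract does not specify the quasi-Dirichlet algorithm, so a simple greedy, abundance-ordered clustering is swapped in here. The function names, the thresholds, and the `min_fraction` cutoff are all hypothetical, and reads are assumed to be pre-aligned to equal length.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_at(reads, threshold):
    """Greedy stand-in for the clustering step (an assumption, not the STC algorithm).

    Distinct sequences are visited from most to least abundant; each one joins
    the first existing cluster seed it matches at >= threshold, or else starts
    a new cluster. Returns a list of [seed_sequence, total_read_count].
    """
    counts = Counter(reads)
    clusters = []
    for seq, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster[1] += n
                break
        else:
            clusters.append([seq, n])
    return clusters


def stepwise_clusters(reads, thresholds=(0.90, 0.95, 1.0), min_fraction=0.05):
    """Cluster at each similarity threshold in turn and keep only clusters whose
    share of all reads is at least min_fraction (a hypothetical frequency rule)."""
    total = len(reads)
    kept = {}
    for t in thresholds:
        kept[t] = [(seed, n) for seed, n in cluster_at(reads, t)
                   if n / total >= min_fraction]
    return kept


if __name__ == "__main__":
    # Toy sample: two similar abundant variants, one distinct variant, one rare variant.
    reads = (["ACGTACGTAC"] * 50 + ["ACGTACGTAT"] * 30
             + ["TTTTACGTAC"] * 18 + ["ACGTACGTTT"] * 2)
    for t, clusters in stepwise_clusters(reads, thresholds=(0.85, 1.0)).items():
        print(t, clusters)
```

In this toy sample the two abundant, near-identical variants fall into one cluster at the looser threshold and separate at the strictest one, while the rare variant never passes the frequency cutoff. That mirrors, in a very simplified way, the two problems the abstract describes; real data would need alignment, chimera handling, and the criteria defined in the paper itself.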
