Definition of supertypes for HLA molecules using clustering of specificity matrices

Major histocompatibility complex (MHC) proteins are encoded by extremely polymorphic genes and play a crucial role in immunity. However, not all genetically different MHC molecules are functionally different. Sette and Sidney (1999) defined nine HLA class I supertypes and showed that nine main functional binding specificities suffice to cover the binding properties of almost all known HLA class I molecules. Here we present a comprehensive, uniform, and automated study of the functional relationships among all HLA molecules with known specificities. We have developed a novel method for clustering sequence motifs. We construct hidden Markov models for HLA class I molecules using a Gibbs sampling procedure and use the similarities among these models to define clusters of specificities. These clusters extend the previously suggested supertypes. We suggest splitting some of the alleles in the A1 supertype into a new A26 supertype, and some of the alleles in the B27 supertype into a new B39 supertype. Furthermore, the B8 alleles may define their own supertype. We also use the published specificities for a number of HLA-DR types to define clusters with similar specificities. We report that the previously observed specificities of these class II molecules can be clustered into nine classes, which only partly correspond to the serological classification. We show that HLA molecules can be classified in a uniform and automated way. The definition of clusters allows the selection of representative HLA molecules that better cover the HLA specificity space. This makes it possible to target most HLA alleles with known specificities using only a few peptides, and may be used in the construction of vaccines. Supplementary material is available at http://www.cbs.dtu.dk/researchgroups/immunology/supertypes.html.
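The idea of clustering specificity matrices can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy matrices use a three-letter alphabet and two positions rather than real HLA motifs, the distance (one minus the cosine similarity of the amino acid frequency vectors, summed over positions) is one plausible choice rather than necessarily the measure used in the study, and average-linkage merging with a fixed threshold stands in for whatever tree-building method is actually applied. The names `column_distance`, `matrix_distance`, `cluster_matrices`, and the `toy_*` labels are hypothetical.

```python
from itertools import combinations
from math import sqrt


def column_distance(p, q):
    """1 - cosine similarity of two frequency vectors (assumed non-zero)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm


def matrix_distance(m1, m2):
    """Sum of per-position distances between two equal-length matrices."""
    return sum(column_distance(p, q) for p, q in zip(m1, m2))


def cluster_matrices(matrices, threshold):
    """Average-linkage agglomerative clustering (illustrative stand-in).

    Repeatedly merges the two closest clusters until the smallest
    average distance between any pair of clusters exceeds `threshold`.
    """
    clusters = [[name] for name in matrices]

    def linkage(c1, c2):
        total = sum(matrix_distance(matrices[a], matrices[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy data: rows are motif positions, columns are frequencies over a
# made-up three-letter alphabet. These are NOT real HLA specificities.
toy_matrices = {
    "toy_A": [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]],
    "toy_B": [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]],
    "toy_C": [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]],
}

found = cluster_matrices(toy_matrices, threshold=0.5)
print(found)
```

In this toy setting the two matrices with similar position preferences end up in one cluster and the third stays separate. Picking one member from each resulting cluster corresponds to the selection of representative molecules described above; the threshold of 0.5 is arbitrary and would need to be chosen with care on real data.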

[1] N. Saitou, et al. The neighbor-joining method: a new method for reconstructing phylogenetic trees, 1987, Molecular biology and evolution.

[2] T. D. Schneider, et al. Sequence logos: a new way to display consensus sequences, 1990, Nucleic acids research.

[3] Thomas M. Cover, et al. Elements of Information Theory, 2005.

[4] Jun S. Liu, et al. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment, 1993, Science.

[5] Z. Nagy, et al. Precise prediction of major histocompatibility complex class II-peptide interaction based on peptide side chain scanning, 1994, The Journal of experimental medicine.

[6] S. Henikoff, et al. Position-based sequence weights, 1994, Journal of molecular biology.

[7] G Hermanson, et al. Binding of a peptide antigen to multiple HLA alleles allows definition of an A2-like supertype, 1995, Journal of immunology.

[8] M F del Guercio, et al. Several HLA alleles share overlapping peptide specificities, 1995, Journal of immunology.

[9] M F del Guercio, et al. Definition of an HLA-A3-like supermotif demonstrates the overlapping peptide-binding repertoires of common HLA molecules, 1996, Human immunology.

[10] C DeLisi, et al. HLA allele selection for designing peptide vaccines, 1996, Genetic analysis: biomolecular engineering.

[11] F. Sinigaglia, et al. HLA class II peptide binding specificity and autoimmunity, 1997, Advances in immunology.

[12] Vladimir Brusic, et al. MHCPEP, a database of MHC-binding peptides: update 1996, 1997, Nucleic Acids Res.

[13] Gapped BLAST and PSI-BLAST: A new , 1997.

[14] M F del Guercio, et al. Several common HLA-DR types share largely overlapping peptide binding repertoires, 1998, Journal of immunology.

[15] Sean R. Eddy, et al. Profile hidden Markov models, 1998, Bioinform.

[16] Fernando Rodriguez, et al. DNA Immunization with Minigenes: Low Frequency of Memory Cytotoxic T Lymphocytes and Inefficient Antiviral Protection Are Rectified by Ubiquitination, 1998, Journal of Virology.

[18] U. Şahin, et al. Generation of tissue-specific and promiscuous HLA ligand databases using DNA microarrays and virtual HLA class II matrices, 1999, Nature Biotechnology.

[19] G Y Ishioka, et al. Utilization of MHC class I transgenic mice for development of minigene DNA vaccines encoding multiple HLA-restricted CTL epitopes, 1999, Journal of immunology.

[20] H. Rammensee, et al. SYFPEITHI: database for MHC ligands and peptide motifs, 1999, Immunogenetics.

[21] Christian N. S. Pedersen, et al. Metrics and Similarity Measures for Hidden Markov Models, 1999, ISMB.

[22] J. Sidney, et al. Nine major HLA class I supertypes account for the vast preponderance of HLA-A and -B polymorphism, 1999, Immunogenetics.

[23] Gajendra P. S. Raghava, et al. ProPred: prediction of HLA-DR binding sites, 2001, Bioinform.

[24] S L Lauemøller, et al. Establishment of a quantitative ELISA capable of determining peptide-MHC class I interaction, 2002, Tissue antigens.

[25] Maria Jesus Martin, et al. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003, 2003, Nucleic Acids Res.

[26] Hans-Georg Rammensee, et al. MHC ligands and peptide motifs: first listing, 2004, Immunogenetics.