An Improved Fully Connected Hidden Markov Model for Rational Vaccine Design

Large-scale in vitro vaccine screening is an expensive and slow process, while rational vaccine design is faster and cheaper. In contrast to the empirical methods used to design vaccines in biology laboratories, rational vaccine design models the structure of vaccines computationally. Building an effective predictive computer model requires extensive knowledge of the process or phenomenon being modelled. Given current knowledge of the steps involved in immune system responses, computer models focus on one or two of the most important and best understood steps, for example the presentation of antigens by major histocompatibility complex (MHC) molecules. In this step, the MHC molecule selectively binds some peptides derived from antigens and then presents them to the T-cell. One current focus in rational vaccine design is therefore the prediction of peptides that can be bound by MHC. In principle, predicting which peptides bind to a particular MHC molecule involves discovering patterns in known MHC-binding peptides and then searching new antigenic protein sequences for peptides that conform to these patterns. According to previous work, hidden Markov models (HMMs), a machine learning technique, are one of the most effective approaches to this task. Unfortunately, for models such as HMMs, the number of parameters to be determined is larger than the number that can be reliably estimated from the available training data. Heuristic approaches therefore have to be developed to determine the parameters. In this research, two heuristic approaches are proposed. The first initializes the HMM transition and emission probability matrices by assigning biological meanings to the states. The second tailors the structure of a fully connected HMM (fcHMM) to increase specificity. The effectiveness of these two approaches is tested on two human leukocyte antigen (HLA) alleles, HLA-A*0201 and HLA-B*3501. The results indicate that these approaches can improve predictive accuracy. Further, the HMM implementation incorporating the above heuristics can outperform a popular profile HMM (pHMM) program, HMMER, in terms of predictive accuracy.
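As a rough illustration of how an HMM can score candidate peptides, the following minimal Python sketch computes the log-likelihood of a peptide under a small fully connected HMM with the scaled forward algorithm, then ranks every 9-residue window of a protein sequence. It is not the implementation evaluated in this work: the function names, the number of states, the window length, and the randomly initialized parameters are all assumptions chosen for illustration. In practice the parameters would be initialized heuristically and trained on known binders.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def normalize(row):
    """Rescale a list of positive numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]


def random_fchmm(n_states, seed=0):
    """Hypothetical helper: a fully connected HMM with random parameters.

    Every state can reach every other state (all transitions non-zero).
    Random values are placeholders, not trained or biologically motivated.
    """
    rng = random.Random(seed)
    start = normalize([rng.random() + 0.1 for _ in range(n_states)])
    trans = [normalize([rng.random() + 0.1 for _ in range(n_states)])
             for _ in range(n_states)]
    emit = [normalize([rng.random() + 0.1 for _ in AMINO_ACIDS])
            for _ in range(n_states)]
    return start, trans, emit


def log_likelihood(peptide, start, trans, emit):
    """Scaled forward algorithm: returns log P(peptide | model)."""
    obs = [AMINO_ACIDS.index(residue) for residue in peptide]
    n = len(start)
    # Initialization with the first residue.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    scale = sum(alpha)
    log_p = math.log(scale)
    alpha = [a / scale for a in alpha]
    # Recursion over the remaining residues, rescaling at each step.
    for o in obs[1:]:
        alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
                 for s in range(n)]
        scale = sum(alpha)
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


def scan_protein(protein, model, length=9):
    """Score every window of the given length; best-scoring windows first."""
    start, trans, emit = model
    scored = []
    for i in range(len(protein) - length + 1):
        window = protein[i:i + length]
        scored.append((log_likelihood(window, start, trans, emit), i, window))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    model = random_fchmm(n_states=3, seed=1)
    # Made-up sequence, used only to exercise the code.
    for score, position, window in scan_protein("MKTLLVAGLYLPAIVGSQ", model)[:3]:
        print(position, window, round(score, 3))
```

With random parameters the ranking carries no biological meaning; the sketch only shows the scoring and scanning mechanics. Rescaling at each step avoids numerical underflow on longer sequences.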
