Prediction of protein allergenicity using local description of amino acid sequence.

The steady increase in atopic allergy and other hypersensitivity reactions has intensified the need for successful therapeutic approaches. Existing bioinformatic tools for predicting allergenic potential rely primarily on sequence similarity searches along the entire protein sequence and do not address the dual issues of conformational and overlapping B-cell epitope recognition sites. In this study, we report AllerPred, a computational system that captures multiple overlapping continuous and discontinuous B-cell epitope binding patterns in allergenic proteins, using a support vector machine (SVM) as its prediction engine. A novel representation based on local protein sequence descriptors is what enables the system to model these patterns within a protein sequence. The model was trained and tested using 669 IUIS allergens and 1237 non-allergens. In testing, the SVM models achieved an area under the receiver operating characteristic curve (AROC) of 0.81, with 76% sensitivity at 76% specificity. On a standardized testing dataset of experimentally validated allergen and non-allergen sequences, this approach consistently outperforms existing allergenicity prediction systems.
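As a rough illustration of what a local sequence description can look like, the following minimal sketch splits a protein into half-overlapping regions and records, for each region, the fraction of residues in each of three hydrophobicity classes. The resulting fixed-length vector is the kind of input an SVM could be trained on. The abstract does not specify AllerPred's actual descriptors, so everything here is an assumption made for illustration: the three-class residue grouping, the number of regions, and the names `local_descriptors`, `window_composition`, and `sensitivity_specificity` are all hypothetical.

```python
from collections import Counter

# Illustrative three-class hydrophobicity grouping (an assumption,
# not taken from the abstract).
GROUPS = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}
AA_TO_GROUP = {aa: name for name, aas in GROUPS.items() for aa in aas}
GROUP_ORDER = ("polar", "neutral", "hydrophobic")


def window_composition(segment):
    """Fraction of residues in each class; unknown letters are ignored."""
    counts = Counter(AA_TO_GROUP[aa] for aa in segment if aa in AA_TO_GROUP)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(GROUP_ORDER)
    return [counts[g] / total for g in GROUP_ORDER]


def local_descriptors(sequence, n_regions=4):
    """Concatenate class compositions of n_regions half-overlapping regions."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        return [0.0] * (n_regions * len(GROUP_ORDER))
    # Each region spans two steps, so neighbouring regions share one step.
    step = n / (n_regions + 1)
    features = []
    for i in range(n_regions):
        start = int(round(i * step))
        end = int(round((i + 2) * step))
        features.extend(window_composition(seq[start:end]))
    return features


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary labels (1 = allergen)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec
```

Because every sequence maps to a vector of the same length regardless of protein size, vectors like these can be passed directly to a standard SVM implementation; overlapping regions let a pattern that straddles a region boundary still contribute to at least one region's composition.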
