The COMPARE Database: A Public Resource for Allergen Identification, Adapted for Continuous Improvement

Motivation: Databases that identify allergenic proteins through a transparent, consensus-based scientific approach are essential to the safety review of genetically modified foods and feeds, and to public safety in general. In recent years, screening for potential new allergen sequences has become more complex because of the exponential increase in genomic sequence information. To address these challenges, an international collaborative scientific group coordinated by the Health and Environmental Sciences Institute (HESI) was tasked with developing a contemporary, adaptable, high-throughput process to build the COMprehensive Protein Allergen REsource (COMPARE) database, a publicly accessible allergen sequence data resource with accompanying bioinformatics analysis tools that follow the guidelines of the FAO/WHO and the Codex Alimentarius Commission.

Results: The COMPARE process is novel in that candidate sequences are identified once a year by an automated keyword-based sorting algorithm, followed by manual curation of the annotated sequence entries retrieved from public protein sequence databases. The process is designed for continuous improvement, and updates are transparently documented with each version. As a complementary approach, a yearly keyword-based search of literature databases identifies new allergen sequences that have not (yet) been submitted to protein databases. In addition, comments from the independent peer-review panel are posted on the website to increase the transparency of decision making. Finally, sequence comparison capabilities associated with the COMPARE database were developed to evaluate the potential allergenicity of proteins, based on the internationally recognized guidelines of the FAO/WHO and the Codex Alimentarius Commission.
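As an illustration of the kind of sequence comparison mentioned above, the following is a minimal sketch of one commonly applied short-match criterion: flagging any stretch of eight contiguous identical amino acids shared between a query protein and a known allergen. It is not the COMPARE implementation; the function names `shared_kmers` and `screen`, the window length default, and the toy sequences are assumptions made for this example. Guideline-based assessments also use an identity threshold over an 80-residue window, which needs an alignment tool and is not covered here.

```python
from typing import Dict, List, Tuple


def shared_kmers(query: str, allergen: str, k: int = 8) -> List[Tuple[int, int, str]]:
    """Return (query_pos, allergen_pos, kmer) for every exact k-residue match.

    Positions are 0-based. k=8 is an assumed default for this sketch.
    """
    # Index every k-mer of the allergen by its start positions.
    index: Dict[str, List[int]] = {}
    for j in range(len(allergen) - k + 1):
        index.setdefault(allergen[j:j + k], []).append(j)

    # Slide a k-residue window over the query and look each window up.
    hits: List[Tuple[int, int, str]] = []
    for i in range(len(query) - k + 1):
        kmer = query[i:i + k]
        for j in index.get(kmer, []):
            hits.append((i, j, kmer))
    return hits


def screen(query: str, allergens: Dict[str, str], k: int = 8) -> Dict[str, List[Tuple[int, int, str]]]:
    """Map each allergen name to its exact k-mer hits, omitting allergens with none."""
    results: Dict[str, List[Tuple[int, int, str]]] = {}
    for name, seq in allergens.items():
        hits = shared_kmers(query, seq, k)
        if hits:
            results[name] = hits
    return results


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_allergens = {
        "toy_allergen_A": "GGGAYIAKQRQGGG",
        "toy_allergen_B": "PPPPPPPPPPPP",
    }
    print(screen("MKTAYIAKQRQISFVK", toy_allergens))
```

An exact-match index keeps the check linear in sequence length, which matters when a query is screened against every entry of a database. A hit from such a check is only a flag for further evaluation, not evidence of allergenicity.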
