B-cell Epitope Prediction Method based on Deep Ensemble Architecture and Sequences

Predicting B-cell epitopes supports the design of molecules that can mimic the structure and function of a genuine epitope and replace it in antibody diagnostics and therapeutics, as well as the design of potentially safer vaccines. It is an essential step in immunotherapy. Predicting conformational B-cell epitopes from the antigen sequence alone remains a challenge. In this work, we construct a deep ensemble architecture for B-cell epitope prediction based on antigen sequences. We encode protein sequence fragments using one-hot vectors and physicochemical properties, construct seven independent convolutional neural networks, and integrate the seven networks by weighted averaging. The proposed method is evaluated on the testing datasets of BepiPred-2.0. The experimental results show that the method achieves an AUC of 0.771, a sensitivity of 0.711, and an MCC of 0.222. We also evaluate our method on 13 independent test cases, where its results are superior to those of existing methods. The comparisons show that a deep ensemble framework using only antigen sequences can predict conformational B-cell epitopes at an acceptable level of AUC. The code and instructions for reproducing this work are available at https://github.com/yangycoding/DeepBcellPre.
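
As a rough illustration of two steps named in the abstract, the sketch below one-hot encodes a sequence fragment and combines per-model scores by weighted averaging. It is not the authors' implementation: the function names, the 20-letter alphabet ordering, the all-zero handling of unknown residues, the example scores, and the equal weights are all assumptions made for illustration. The physicochemical encoding and the convolutional networks themselves are omitted.

```python
# Illustrative sketch only; names and values below are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 standard residues


def one_hot_encode(fragment):
    """Encode a sequence fragment as a list of 20-dimensional one-hot rows."""
    rows = []
    for residue in fragment:
        row = [0.0] * len(AMINO_ACIDS)
        index = AMINO_ACIDS.find(residue)
        if index >= 0:
            row[index] = 1.0
        # Assumption: non-standard residues (e.g. 'X') are left as all-zero rows.
        rows.append(row)
    return rows


def weighted_average(scores, weights):
    """Combine per-model scores into a single ensemble score."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


# Hypothetical per-residue scores from seven independently trained networks.
model_scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74, 0.59]
# Equal weights are an assumption; in practice they might be tuned on validation data.
model_weights = [1.0] * 7

encoded = one_hot_encode("ACDX")
ensemble_score = weighted_average(model_scores, model_weights)
```

With equal weights the ensemble score reduces to the plain mean of the seven model outputs; unequal weights would let better-performing networks contribute more to the final prediction.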
