Exploration of DPP-IV inhibitory peptide design rules assisted by deep learning pipeline that identifies restriction enzyme cutting site

Mining anti-diabetic dipeptidyl peptidase IV (DPP-IV) inhibitory peptides (DPP-IV-IPs) is currently a costly and laborious process. Because rational peptide design rules are lacking, it relies on cumbersome screening of unknown enzyme hydrolysates. Here, we present an enhanced deep learning (DL) model called BERT-DPPIV, specifically designed to classify DPP-IV-IPs and to explore their design rules in order to discover potent candidates. The end-to-end model uses a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) architecture to extract structural and functional information from input peptides and accurately identify DPP-IV-IPs among them. Experimental results on a benchmark dataset showed that BERT-DPPIV achieved a state-of-the-art accuracy of 0.894, surpassing the 0.797 obtained by a sequence-feature model. Furthermore, by analyzing the attention mechanism, we found that our model could recognize restriction enzyme cutting sites and specific residues that contribute to the inhibition of DPP-IV. Moreover, the design rules for DPP-IV inhibitory tripeptides and pentapeptides proposed under the guidance of BERT-DPPIV were validated, and they can be used to screen potent DPP-IV-IPs.
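To make the attention-based interpretation concrete, the following is a minimal sketch of how scaled dot-product self-attention assigns a weight to each residue of a peptide. It is not the BERT-DPPIV model: the embeddings are random stand-ins for learned representations, there are no learned query/key projections, and the function names, the embedding dimension, and the example pentapeptide are all assumptions made for illustration only.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_embeddings(dim=8, seed=0):
    """Random per-residue vectors (hypothetical stand-ins for learned embeddings)."""
    rng = random.Random(seed)
    return {aa: [rng.gauss(0.0, 1.0) for _ in range(dim)] for aa in AMINO_ACIDS}


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(peptide, emb):
    """Row i holds how strongly residue i attends to every residue j."""
    vecs = [emb[aa] for aa in peptide]
    scale = math.sqrt(len(vecs[0]))
    rows = []
    for q in vecs:
        # Scaled dot product between the query residue and every key residue.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in vecs]
        rows.append(softmax(scores))
    return rows


def residue_importance(peptide, emb):
    """Mean attention received by each position (column mean of the matrix)."""
    w = attention_weights(peptide, emb)
    n = len(peptide)
    return [sum(w[i][j] for i in range(n)) / n for j in range(n)]


emb = toy_embeddings()
peptide = "IPAVF"  # arbitrary pentapeptide, chosen only for illustration
weights = attention_weights(peptide, emb)
importance = residue_importance(peptide, emb)
for aa, score in zip(peptide, importance):
    print(f"{aa}: {score:.3f}")
```

In a trained model, positions that consistently receive high attention are the candidates one would inspect as cutting sites or inhibition-relevant residues; with the random embeddings used here, the printed scores carry no biological meaning.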
