Predicting protein‐protein interactions through sequence‐based deep learning

Motivation: High‐throughput experimental techniques have produced large amounts of protein‐protein interaction (PPI) data, but coverage remains low and the data are very noisy. Computational prediction of PPIs can be used to discover new PPIs and to identify errors in the experimental PPI data.

Results: We present a novel deep learning framework, DPPI, to model and predict PPIs from sequence information alone. Our model efficiently applies a deep, Siamese‐like convolutional neural network combined with random projection and data augmentation to predict PPIs, leveraging existing high‐quality experimental PPI data and evolutionary information about the protein pair under prediction. Our experimental results show that DPPI outperforms state‐of‐the‐art methods on several benchmarks in terms of area under the precision‐recall curve (auPR) and is computationally more efficient. We also show that DPPI can predict homodimeric interactions where other methods fail to work accurately, and we demonstrate its effectiveness in specific applications such as predicting cytokine‐receptor binding affinities.

Availability and implementation: DPPI (Predicting protein‐protein interactions through sequence‐based deep learning): https://github.com/hashemifar/DPPI/.

Supplementary information: Supplementary data are available at Bioinformatics online.
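To make the "Siamese‐like" idea in the Results concrete, the following toy sketch scores a protein pair with one shared encoder. It is not the DPPI implementation. Every name (`conv1d_maxpool`, `project`, `score_pair`), every dimension, and the choice to combine the two branches by an element‐wise product are assumptions made for illustration. The weights are random rather than trained, and the profiles are random stand‐ins for real evolutionary profiles.

```python
import math
import random


def conv1d_maxpool(profile, filters):
    """Apply each filter along the sequence, then ReLU and global max-pooling."""
    width = len(filters[0])
    n_channels = len(profile[0])
    features = []
    for filt in filters:
        best = 0.0  # starting at 0 makes the max equivalent to ReLU + max-pool
        for start in range(len(profile) - width + 1):
            act = sum(
                filt[k][c] * profile[start + k][c]
                for k in range(width)
                for c in range(n_channels)
            )
            best = max(best, act)
        features.append(best)
    return features


def project(features, matrix):
    """Fixed (untrained) random linear projection of a feature vector."""
    return [sum(w * f for w, f in zip(row, features)) for row in matrix]


def score_pair(profile_a, profile_b, filters, proj, out_weights):
    """Encode both proteins with the SAME weights, combine, squash to (0, 1)."""
    za = project(conv1d_maxpool(profile_a, filters), proj)
    zb = project(conv1d_maxpool(profile_b, filters), proj)
    # Assumed combination rule: element-wise product, then a linear layer.
    logit = sum(w * (a * b) for w, a, b in zip(out_weights, za, zb))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical sizes: 20 profile columns, filter width 3, 4 filters, 6 projections.
rng = random.Random(0)
N_CHANNELS, WIDTH, N_FILTERS, N_PROJ = 20, 3, 4, 6


def random_profile(length):
    """Stand-in for a per-residue profile of a protein of the given length."""
    return [[rng.random() for _ in range(N_CHANNELS)] for _ in range(length)]


filters = [
    [[rng.gauss(0, 0.1) for _ in range(N_CHANNELS)] for _ in range(WIDTH)]
    for _ in range(N_FILTERS)
]
proj = [[rng.gauss(0, 1) for _ in range(N_FILTERS)] for _ in range(N_PROJ)]
out_weights = [rng.gauss(0, 1) for _ in range(N_PROJ)]

protein_a = random_profile(12)
protein_b = random_profile(9)

p_ab = score_pair(protein_a, protein_b, filters, proj, out_weights)
p_ba = score_pair(protein_b, protein_a, filters, proj, out_weights)
p_aa = score_pair(protein_a, protein_a, filters, proj, out_weights)  # homodimer-style query
```

Because both branches share weights and the combination in this sketch is an element‐wise product, the score does not depend on the order of the two proteins, and a protein can be paired with itself without special handling. Global max‐pooling also lets the two inputs have different lengths.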
