Combining Evolutionary Information and Sparse Bayesian Probability Model to Accurately Predict Self-interacting Proteins

Self-interacting proteins (SIPs) play a crucial role in the investigation of various biochemical processes. In this work, we propose a novel computational method that accelerates SIP validation using only the protein sequence. First, the protein sequence is represented as a Position-Specific Weight Matrix (PSWM) containing protein evolutionary information. Then, we combine the Legendre Moment (LM) and Sparse Principal Component Analysis (SPCA) to extract essential, noise-resistant evolutionary features from the PSWM. Finally, we use the robust Probabilistic Classification Vector Machine (PCVM) classifier to carry out the prediction. In the cross-validation experiment, the proposed method achieves 95.54% accuracy on the S. cerevisiae dataset, a significant improvement over several competing SIP predictors. The empirical tests reveal that the proposed method can efficiently extract salient features from protein sequences and accurately predict potential SIPs.
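
To make the feature-extraction step more concrete, the following is a minimal sketch of how Legendre moments could be computed from a matrix such as a PSWM. It is an illustration under stated assumptions, not the implementation used in this work: the function names `legendre_values` and `legendre_moments`, the pixel-centre mapping of row and column indices onto [-1, 1], and the choice of maximum order are all hypothetical. The SPCA and PCVM stages are not shown.

```python
def legendre_values(x, max_order):
    """Return [P_0(x), ..., P_max_order(x)] via the three-term recurrence."""
    values = [1.0, x]
    for k in range(1, max_order):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        values.append(((2 * k + 1) * x * values[k] - k * values[k - 1]) / (k + 1))
    return values[: max_order + 1]


def legendre_moments(matrix, max_order):
    """Approximate Legendre moments L[p][q] of a 2-D matrix (list of rows).

    Assumption: cell (i, j) is sampled at its centre after mapping the row
    and column indices onto the interval [-1, 1].
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    px = [legendre_values(-1.0 + (2 * i + 1) / n_rows, max_order) for i in range(n_rows)]
    py = [legendre_values(-1.0 + (2 * j + 1) / n_cols, max_order) for j in range(n_cols)]

    moments = []
    for p in range(max_order + 1):
        row = []
        for q in range(max_order + 1):
            total = sum(
                px[i][p] * py[j][q] * matrix[i][j]
                for i in range(n_rows)
                for j in range(n_cols)
            )
            # (2p+1)(2q+1)/4 times the cell area (2/n_rows)(2/n_cols)
            row.append((2 * p + 1) * (2 * q + 1) * total / (n_rows * n_cols))
        moments.append(row)
    return moments


# Toy stand-in for a PSWM: 10 sequence positions by 20 amino-acid columns.
toy_matrix = [[1.0] * 20 for _ in range(10)]
features = [v for row in legendre_moments(toy_matrix, 3) for v in row]
```

Flattening the moment matrix gives a fixed-length vector regardless of sequence length, which is the property that makes such a descriptor convenient as input to a dimensionality-reduction step and a classifier.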
