Protein–Protein Interactions Prediction via Multimodal Deep Polynomial Network and Regularized Extreme Learning Machine

Predicting protein–protein interactions (PPIs) plays an important role in many applications, so an accurate computational method for PPI prediction is highly desirable. Proteins involved in PPIs can be described by three modalities: the amino acid mutation rate and two physicochemical properties, hydrophobicity and hydrophilicity. A deep polynomial network (DPN) is well suited to integrating these modalities because, trained with a supervised deep learning algorithm, it can represent any function on a finite sample dataset. We propose a multimodal DPN (MDPN) algorithm that integrates these modalities to enhance prediction performance. MDPN is a two-stage DPN: the first stage feeds each type of protein feature into a DPN encoder to obtain a high-level feature representation, and the second stage cascades the three types of high-level features and feeds them into a further DPN encoder to fuse and learn from them. A regularized extreme learning machine then predicts PPIs. Tested on the public H. pylori, Human, and Yeast datasets, the proposed method achieves average accuracies of 97.87%, 99.90%, and 98.11%, respectively. It also achieves good accuracy on other datasets. Furthermore, we test the method on three kinds of PPI networks and obtain superior prediction results.
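The final classification step can be illustrated with a minimal sketch of a regularized extreme learning machine, written under stated assumptions rather than as the authors' implementation. It assumes the fused MDPN features are already available as a NumPy matrix `X` with binary labels `y`. The function names `train_relm` and `predict_relm`, the sigmoid activation, the hidden-layer size, and the regularization constant `C` are all hypothetical choices made for illustration. The hidden-layer weights are drawn at random and kept fixed; only the output weights are solved for in closed form with a ridge penalty.

```python
import numpy as np


def _hidden(X, W, b):
    # Sigmoid activations of the fixed random hidden layer.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def train_relm(X, y, n_hidden=64, C=1.0, seed=0):
    """Fit a regularized ELM (illustrative sketch, not the paper's code).

    X: (n_samples, n_features) feature matrix, assumed to be fused features.
    y: (n_samples,) labels in {0, 1} (1 = interacting pair).
    C: regularization constant; larger C means weaker regularization.
    """
    rng = np.random.default_rng(seed)
    # Random input weights and biases are never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = _hidden(X, W, b)
    # Encode labels as -1 / +1 regression targets.
    T = np.where(y == 1, 1.0, -1.0)
    # Ridge solution: beta = (H^T H + I / C)^{-1} H^T T.
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta


def predict_relm(model, X):
    """Return 0/1 predictions from a model produced by train_relm."""
    W, b, beta = model
    return (_hidden(X, W, b) @ beta > 0).astype(int)
```

Calling `train_relm(X, y)` returns a tuple that can be passed directly to `predict_relm(model, X_new)`. Solving a single linear system is what makes training fast compared with iterative backpropagation; the `I / C` term keeps the system well conditioned when the hidden activations are nearly collinear.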
