Paper information - Boosting compound-protein interaction prediction by deep learning

The identification of interactions between compounds and proteins plays an important role in network pharmacology and drug discovery. However, identifying compound-protein interactions (CPIs) experimentally is generally expensive and time-consuming, so computational approaches have been introduced. Among these, machine-learning-based methods have achieved considerable success. However, because biological data are nonlinear and imbalanced, many machine learning approaches have their own limitations. Recently, deep learning techniques have shown advantages over many state-of-the-art machine learning methods in many applications. In this study, we aim to improve the performance of CPI prediction using deep learning, and propose a method called DL-CPI (short for Deep Learning for Compound-Protein Interactions prediction), which employs a deep neural network (DNN) to learn effective representations of compound-protein pairs. Extensive experiments show that DL-CPI learns useful features of compound-protein pairs through layerwise abstraction, and thus achieves better prediction performance than existing methods on both balanced and imbalanced datasets.
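To make the idea of a DNN scoring a compound-protein pair concrete, here is a minimal forward-pass sketch in plain Python. It is not the DL-CPI implementation: the binary feature encodings, the layer sizes, the Glorot-style initialization, and every function name below are assumptions chosen for illustration, and the weights are untrained, so the score carries no predictive meaning.

```python
import math
import random


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(v, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Random uniform weights scaled by fan-in and fan-out (an assumed choice)."""
    limit = math.sqrt(6.0 / (n_in + n_out))
    weights = [[rng.uniform(-limit, limit) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_in, hidden_sizes, rng):
    """Stack hidden layers and finish with a single output unit."""
    sizes = [n_in] + list(hidden_sizes) + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def predict_interaction(compound_bits, protein_bits, layers):
    """Concatenate the pair's features and pass them through the network."""
    h = list(compound_bits) + list(protein_bits)
    for weights, bias in layers[:-1]:
        h = relu(dense(h, weights, bias))  # each hidden layer re-represents the pair
    weights, bias = layers[-1]
    return sigmoid(dense(h, weights, bias)[0])  # interaction score in (0, 1)


rng = random.Random(0)
compound = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical substructure fingerprint
protein = [0, 1, 0, 0, 1, 1]         # hypothetical protein-domain indicator vector
layers = build_network(len(compound) + len(protein), [8, 4], rng)
score = predict_interaction(compound, protein, layers)
print(f"untrained interaction score: {score:.3f}")
```

In a real setting the weights would be fitted on labeled pairs, and an imbalanced dataset would typically call for some form of class weighting or resampling; neither step is shown here.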
