Prediction of anti‐HIV activity on the basis of stacked auto‐encoder

Predicting biologically active compounds plays an important role in high‐throughput screening for drug discovery. Most computational models in this area concentrate on measuring structural similarities between chemical elements. Various methods have been used to predict anti‐HIV activity, such as artificial neural networks and support vector machines, but these are generally shallow machine learning methods with low accuracy, trained on few samples. In this work, a deep learning method, the stacked auto‐encoder (SAE), is proposed for the first time to predict the anti‐HIV activity of a broad group of compounds. In comparative experiments with an artificial neural network, a support vector machine, and an SAE under the same conditions, accuracy after descriptor screening is higher than with raw descriptors, and the SAE performs better than the other two methods, achieving a perfect forecast of anti‐HIV activity. This result is significant for anti‐HIV drug design, since it can reduce research and development costs and improve the efficiency of anti‐HIV drug discovery.
