The role of different sampling methods in improving biological activity prediction using deep belief network

Thousands of molecules and descriptors are available to the medicinal chemist thanks to technological advances across the branches of chemistry. This abundance, together with the correlations among descriptors, has raised new problems in quantitative structure–activity relationship (QSAR) studies. Proper parameter initialization in statistical modeling has emerged as another challenge in recent years: random initialization of parameters leads to poor performance of deep neural networks (DNNs). In this research, a deep belief network (DBN) was applied to initialize DNNs. A DBN is composed of stacked restricted Boltzmann machines (RBMs), energy-based models whose training requires the log-likelihood gradient over all samples. Three different sampling approaches were suggested for approximating this gradient. Accordingly, DBN pre-training based on each of these sampling approaches was used to initialize the DNN architecture for predicting the biological activity of all fifteen Kaggle targets, which together contain more than 70k molecules. As in other fields of deep learning research, the outputs of these models were markedly superior to those of a DNN with randomly initialized parameters. © 2016 Wiley Periodicals, Inc.
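The abstract does not specify which sampling approaches were used or how the RBM gradient was approximated. As a minimal, purely illustrative sketch (not the authors' implementation), the snippet below shows one contrastive-divergence (CD-k) update for a binary RBM, a widely used way of approximating the intractable model expectation in the log-likelihood gradient dlogp/dW = <v h>_data − <v h>_model. The layer sizes, learning rate, and value of k are hypothetical choices for the example.

```python
import numpy as np

# Illustrative CD-k update for a binary RBM (not the paper's code).
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_bernoulli(p):
    # Draw 0/1 samples with probabilities p.
    return (rng.random(p.shape) < p).astype(p.dtype)

def cd_k_update(v0, W, b, c, k=1, lr=0.01):
    """One CD-k step on a batch of visible vectors v0 (shape: batch x n_visible)."""
    # Positive phase: hidden probabilities driven by the data.
    ph0 = sigmoid(v0 @ W + c)
    h = sample_bernoulli(ph0)
    # Negative phase: k steps of block Gibbs sampling started at the data.
    for _ in range(k):
        pv = sigmoid(h @ W.T + b)
        v = sample_bernoulli(pv)
        ph = sigmoid(v @ W + c)
        h = sample_bernoulli(ph)
    # Approximate gradient: data statistics minus model (chain) statistics.
    n = v0.shape[0]
    dW = (v0.T @ ph0 - v.T @ ph) / n
    db = (v0 - v).mean(axis=0)
    dc = (ph0 - ph).mean(axis=0)
    return W + lr * dW, b + lr * db, c + lr * dc

# Toy usage: 6 visible units, 4 hidden units, random binary batch of 8 samples.
W = 0.01 * rng.standard_normal((6, 4))
b = np.zeros(6)
c = np.zeros(4)
batch = sample_bernoulli(np.full((8, 6), 0.5))
W, b, c = cd_k_update(batch, W, b, c, k=1)
```

In a DBN used for pre-training, several such RBMs would be trained greedily layer by layer, and the learned weights would then initialize the corresponding layers of the DNN before supervised fine-tuning.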
