Guiding Conventional Protein-Ligand Docking Software with Convolutional Neural Networks

The high-performance computational techniques have brought significant benefits for drug discovery efforts in recent decades. One of the most challenging problems in drug discovery is the protein-ligand binding pose prediction. To predict the most stable structure of the complex, the performance of conventional structure-based molecular docking methods heavily depends on the accuracy of scoring or energy functions (as an approximation of affinity) for each pose of the protein-ligand docking complex to effectively guide the search in an exponentially large solution space. However, due to the heterogeneity of molecular structures, the existing scoring calculation methods are either tailored to a particular data set or fail to exhibit high accuracy. In this paper, we propose a convolutional neural network (CNN)-based model that learns to predict the stability factor of the protein-ligand complex and exhibits the ability of CNNs to improve the existing docking software. Evaluated results on PDBbind data set indicate that our approach reduces the execution time of the traditional docking-based method while improving the accuracy. Our code, experiment scripts, and pretrained models are available at https://github.com/j9650/MedusaNet.