Spatial Graph Convolutions for Drug Discovery

Predicting the binding free energy, or affinity, of a small molecule for a protein target is frequently the first step along the arc of drug discovery. High-throughput experimental and virtual screening both suffer from low accuracy, whereas more accurate approaches in both domains fail to scale because of financial or time constraints. While machine learning (ML) has made immense progress in computer vision and natural language processing, it has yet to offer comparable improvements over domain-expertise-driven algorithms in the molecular sciences. In this paper, we propose new deep neural network (DNN) architectures for affinity prediction. The new architectures are at least competitive with, and in many cases achieve state-of-the-art performance relative to, previous knowledge-based and physics-based approaches. In addition to more standard evaluation metrics, we propose the Regression Enrichment Factor $EF_\chi^{(R)}$ as a metric for the community to benchmark against in future affinity prediction studies. Finally, we suggest the adaptation of an agglomerative clustering cross-validation strategy to more accurately reflect the generalization capacity of ML-based affinity models in future work.