Uncertainty Quantification Using Neural Networks for Molecular Property Prediction

Uncertainty quantification (UQ) is an important component of molecular property prediction, particularly for drug discovery applications, where model predictions direct experimental design and unanticipated imprecision wastes valuable time and resources. The need for UQ is especially acute for neural models, which are becoming increasingly standard yet are challenging to interpret. While several approaches to UQ have been proposed in the literature, there is no clear consensus on the comparative performance of these methods. In this paper, we study this question in the context of regression tasks. We systematically evaluate several methods on five regression datasets using multiple complementary performance metrics. Our experiments show that none of the methods we tested is unequivocally superior to all others, and none produces a particularly reliable ranking of errors across multiple datasets. We believe these results show that existing UQ methods are not sufficient for all common use cases and that further research is needed. Nevertheless, we conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.
