Reliable Prediction Errors for Deep Neural Networks Using Test-Time Dropout

While the use of deep learning in drug discovery is gaining increasing attention, the lack of methods to compute reliable errors in prediction for Neural Networks prevents their application to guide decision making in domains where identifying unreliable predictions is essential, e.g., precision medicine. Here, we present a framework to compute reliable errors in prediction for Neural Networks using Test-Time Dropout and Conformal Prediction. Specifically, the algorithm consists of training a single Neural Network using dropout, and then applying it N times to both the validation and test sets, with dropout also enabled in this step. Thus, an ensemble of N predictions is generated for each instance in the validation and test sets. The residuals and absolute errors in prediction for the validation set are then used to compute prediction errors for the test set instances using Conformal Prediction. Using 24 bioactivity data sets from ChEMBL 23, we show that Dropout Conformal Predictors are valid (i.e., the fraction of instances whose true value lies within the predicted interval strongly correlates with the confidence level) and efficient, as the predicted confidence intervals span a narrower range of values than those computed with Conformal Predictors generated using Random Forest (RF) models. Lastly, we show in retrospective virtual screening experiments that dropout and RF-based Conformal Predictors lead to comparable retrieval rates of active compounds. Overall, we propose a computationally efficient framework (only N extra forward passes are required in addition to training a single network) to harness Test-Time Dropout and Conformal Prediction, which is generally applicable to generate reliable prediction errors for Deep Neural Networks in drug discovery and beyond.
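
The calibration step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `dropout_conformal_intervals`, the choice of nonconformity score (absolute residual divided by the standard deviation across the N dropout passes), and the synthetic arrays standing in for the stochastic forward passes are all assumptions made for illustration.

```python
import numpy as np


def dropout_conformal_intervals(val_preds, y_val, test_preds,
                                confidence=0.8, eps=1e-8):
    """Hypothetical helper: conformal intervals from dropout ensembles.

    val_preds, test_preds: arrays of shape (N passes, n instances), holding
    the N test-time-dropout predictions for each instance.
    """
    # Summarise each validation ensemble by its mean and spread.
    val_mean = val_preds.mean(axis=0)
    val_std = val_preds.std(axis=0) + eps

    # Assumed nonconformity score: absolute residual scaled by the spread.
    scores = np.abs(y_val - val_mean) / val_std

    # Score at the requested confidence level (conservative rank).
    n = scores.size
    k = min(int(np.ceil((n + 1) * confidence)), n)
    alpha = np.sort(scores)[k - 1]

    # Scale the calibrated score by each test instance's own spread.
    test_mean = test_preds.mean(axis=0)
    half_width = alpha * (test_preds.std(axis=0) + eps)
    return test_mean - half_width, test_mean + half_width


# Synthetic stand-in for N stochastic forward passes of a trained network.
rng = np.random.default_rng(0)
n_passes, n_val, n_test = 20, 500, 500


def fake_dropout_passes(y_true):
    model_error = rng.normal(scale=0.5, size=y_true.size)
    pass_noise = rng.normal(scale=0.3, size=(n_passes, y_true.size))
    return y_true + model_error + pass_noise


y_val = rng.normal(size=n_val)
y_test = rng.normal(size=n_test)
lower, upper = dropout_conformal_intervals(
    fake_dropout_passes(y_val), y_val, fake_dropout_passes(y_test),
    confidence=0.8,
)

# Validity check: fraction of test values falling inside their interval.
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage at 80% confidence: {coverage:.2f}")
```

In practice the two `(N, n)` arrays would come from running the trained network N times with dropout left active. For a valid predictor, the empirical coverage should track the chosen confidence level.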
