Deep Learning Models Compared to Experimental Variability for the Prediction of CYP3A4 Time-Dependent Inhibition.

Most drugs are metabolized mainly by cytochrome P450 (CYP450) enzymes, which can lead to drug-drug interactions (DDI). In particular, time-dependent inhibition (TDI) of the CYP3A4 isoenzyme has been associated with clinically relevant DDI. To address potential DDI issues, high-throughput in vitro assays have been established to assess CYP3A4 TDI during the discovery and lead optimization phases. In silico machine learning models, however, would enable an earlier and larger-scale assessment of potential TDI liabilities. For CYP inhibition, most modeling efforts have focused on small and highly imbalanced data sets. Moreover, assay variability is rarely considered, although it is key to understanding a model's quality and suitability for decision-making. In this work, machine learning models were built to predict CYP3A4 TDI, evaluated prospectively, and compared to the variability of the experimental assay. Different modeling strategies were investigated to assess their influence on model performance. Through multitask learning, additional data sets were leveraged for model building, coming from public databases, in-house CYP-related assays, or other pharmaceutical companies (federated learning). In addition to the numerical prediction of CYP3A4 TDI inactivation rates, three-class predictions were carried out, giving a negative (inactivation rate kobs < 0.01 min⁻¹), weak positive (0.01 ≤ kobs ≤ 0.025 min⁻¹), or positive (kobs > 0.025 min⁻¹) output. The final multitask graph neural network model achieved misclassification rates of 8% and 7% for positive and negative TDI, respectively. Importantly, the presented deep learning-based predictions had a precision similar to the reproducibility of in vitro experiments and thus offered great opportunities for drug design, early derisking of DDI potential, and selection of experiments.
To facilitate CYP inhibition modeling efforts in the public domain, the developed model was used to annotate ∼16 000 publicly available structures, and a surrogate data set is shared as Supporting Information.
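
The three-class scheme described in the abstract can be illustrated with a minimal sketch that maps an inactivation rate kobs (in min⁻¹) to the negative, weak positive, or positive label using the stated thresholds. The function name `classify_kobs` and the label strings are assumptions made for illustration; this is not the authors' implementation, only a direct transcription of the cutoffs.

```python
# Hypothetical helper: thresholds are taken from the abstract,
# the function and constant names are illustrative assumptions.
NEGATIVE_CUTOFF = 0.01   # kobs below this (min^-1) is labeled negative
POSITIVE_CUTOFF = 0.025  # kobs above this (min^-1) is labeled positive


def classify_kobs(kobs: float) -> str:
    """Map an inactivation rate kobs (min^-1) to a three-class TDI label."""
    if kobs < 0:
        raise ValueError("kobs must be non-negative")
    if kobs < NEGATIVE_CUTOFF:
        return "negative"
    if kobs <= POSITIVE_CUTOFF:
        # Both boundaries (0.01 and 0.025) fall in the weak positive class.
        return "weak positive"
    return "positive"


if __name__ == "__main__":
    for value in (0.005, 0.01, 0.02, 0.025, 0.03):
        print(f"kobs = {value:.3f} min^-1 -> {classify_kobs(value)}")
```

Note that both boundary values belong to the weak positive class, since the abstract defines that class with inclusive bounds.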
