toxCSM: comprehensive prediction of small molecule toxicity profiles

Drug discovery is a lengthy, costly and high-risk endeavour, further complicated by high attrition rates in later development stages. Toxicity has been one of the main causes of failure during clinical trials, increasing drug development time and costs. To facilitate early identification and optimisation of toxicity profiles, several computational tools have emerged that aim to improve success rates by pre-screening drug candidates early. Despite these efforts, there is an increasing demand for platforms capable of assessing both environmental and human-based toxicity properties at large scale. Here, we present toxCSM, a comprehensive computational platform for the study and optimisation of toxicity profiles of small molecules. toxCSM leverages the well-established concepts of graph-based signatures, molecular descriptors and similarity scores to develop 36 models for predicting a range of toxicity properties, which can assist in developing safer drugs and agrochemicals. toxCSM achieved an area under the receiver operating characteristic curve (AUC) of up to 0.99 and Pearson's correlation coefficients of up to 0.94 on 10-fold cross-validation, with comparable performance on blind test sets, outperforming all alternative methods. toxCSM is freely available as a user-friendly web server and API at http://biosig.lab.uq.edu.au/toxcsm.
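For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how an AUC (for classification models) and a Pearson's correlation coefficient (for regression models) can be computed, together with a simple 10-fold split. It is an illustration only, not the toxCSM implementation: the function names `roc_auc`, `pearson_r` and `k_fold_indices` are hypothetical, and the interleaved fold assignment is an assumption made for brevity.

```python
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists so each sample is held out exactly once.

    Assumption: samples are assigned to folds in an interleaved fashion;
    a real pipeline would usually shuffle or stratify first.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out, test in enumerate(folds):
        train = [
            j
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for j in fold
        ]
        yield train, test
```

In a cross-validation setting, a model would be fitted on each `train` split and scored on the matching `test` split, with the metric computed over the held-out predictions.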
