Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process

Background: QSAR is an established and powerful method for cheap in silico assessment of the physicochemical properties and biological activities of chemical compounds. However, QSAR models are rather complex mathematical constructs that cannot easily be interpreted, whereas medicinal chemists would benefit from practical guidance on which molecules to synthesize. Another possible approach is the analysis of pairs of very similar molecules, so-called matched molecular pairs (MMPs). This approach allows identification of molecular transformations that affect particular activities (e.g. toxicity). In contrast to QSAR, the chemical interpretation of these transformations is straightforward. Furthermore, such transformations can give medicinal chemists useful hints for the hit-to-lead optimization process.

Results: The current study proposes a combination of the QSAR and MMP approaches, in which MMP transformations are found on the basis of QSAR predictions for large chemical datasets. The study shows that this approach, referred to as prediction-driven MMP analysis, is a useful tool for medicinal chemists, allowing identification of large numbers of “interesting” transformations that can be used to drive the molecular optimization process. All the methodological developments have been implemented as software products available online as part of OCHEM (http://ochem.eu/).

Conclusions: The prediction-driven MMP methodology was exemplified by two use cases: modelling of aquatic toxicity and of CYP3A4 inhibition. The approach helped us to interpret QSAR models and allowed identification of a number of “significant” molecular transformations that affect the desired properties. This can facilitate drug design as part of the molecular optimization process.

Graphical Abstract: Matched molecular pairs and transformation graphs facilitate an interpretable molecular optimisation process.
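To make the idea of prediction-driven MMP analysis concrete, the following is a minimal sketch, not the OCHEM implementation. It assumes that the matched pairs have already been enumerated and labelled with a transformation string, and that a QSAR model has already produced a prediction for every molecule. The function names, the `"H>>Cl"` style labels, and the use of an exact two-sided sign test to flag "significant" transformations are all illustrative assumptions; the abstract does not state which statistical test the study uses.

```python
from collections import defaultdict
from math import comb
from statistics import mean


def sign_test_p(n_up, n_down):
    """Exact two-sided binomial sign test (null hypothesis: up and down equally likely)."""
    n = n_up + n_down
    if n == 0:
        return 1.0  # no informative pairs, nothing to test
    k = min(n_up, n_down)
    # Probability of seeing k or fewer of the rarer outcome under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def summarize_transformations(pairs, predictions):
    """Aggregate predicted property changes per transformation.

    pairs: iterable of (mol_a, mol_b, transformation) tuples (hypothetical format).
    predictions: dict mapping a molecule identifier to its predicted value.
    """
    deltas = defaultdict(list)
    for mol_a, mol_b, transformation in pairs:
        # Predicted effect of applying the transformation to mol_a.
        deltas[transformation].append(predictions[mol_b] - predictions[mol_a])

    summary = {}
    for transformation, values in deltas.items():
        n_up = sum(1 for d in values if d > 0)
        n_down = sum(1 for d in values if d < 0)
        summary[transformation] = {
            "n_pairs": len(values),
            "mean_delta": mean(values),
            "p_value": sign_test_p(n_up, n_down),
        }
    return summary
```

Because the input is predictions rather than measurements, the same summary can be computed over any large set of unmeasured molecules. When many transformations are tested at once, the resulting p-values would normally need a multiple-testing correction before any of them is called significant.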
