Scaffold-Based Multi-Objective Drug Candidate Optimization

Multiparameter optimization (MPO) provides a means to assess and balance several variables according to their importance to the overall objective. However, applying MPO methods in therapeutic discovery is challenging because of the number of cheminformatics properties that must be considered to find an optimal solution. High-throughput virtual screening for hit candidates produces a large amount of data with conflicting properties. For instance, binding affinity and toxicity can conflict: a candidate optimized for binding affinity may show unacceptable levels of toxicity that can lead to adverse effects. Instead of exhaustively treating each property separately, multiple properties can be combined into a single MPO score, with a weight assigned to each property. This desirability score also lends itself well to machine learning (ML) applications, which can use the score in the loss function. In this work, we discuss a scaffold-focused, graph-based Markov chain Monte Carlo (MCMC) framework built to generate molecules with optimal properties. The framework trains itself on the fly using the MPO score of each iteration of molecules, can handle a greater number of properties, and samples the chemical space around a starting scaffold. Results are compared with the chemical Transformer model molGCT to judge the relative performance of graph-based and natural language processing approaches.
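To make the idea of a weighted MPO score concrete, the following is a minimal sketch rather than the framework's actual implementation. It assumes that each property is mapped to a desirability in [0, 1] by a linear ramp, that the desirabilities are combined with a weighted geometric mean, and that a sampling step accepts or rejects a proposed molecule with a standard Metropolis rule on the score. The function names, property names, value ranges, weights, and temperature are all hypothetical and chosen only for illustration.

```python
import math
import random


def desirability(value, low, high, maximize=True):
    """Map a raw property value to [0, 1] with a linear ramp (assumed form)."""
    if high <= low:
        raise ValueError("high must be greater than low")
    d = (value - low) / (high - low)
    d = min(1.0, max(0.0, d))  # clamp values outside [low, high]
    return d if maximize else 1.0 - d


def mpo_score(desirabilities, weights):
    """Weighted geometric mean of per-property desirabilities.

    A single property with desirability 0 drives the whole score to 0,
    so one unacceptable property cannot be offset by the others.
    """
    active = {name: w for name, w in weights.items() if w > 0}
    total = sum(active.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    if any(desirabilities[name] == 0.0 for name in active):
        return 0.0
    log_sum = sum(w * math.log(desirabilities[name]) for name, w in active.items())
    return math.exp(log_sum / total)


def metropolis_accept(score_current, score_proposed, temperature=0.1, rng=random):
    """Metropolis rule for a target proportional to exp(score / temperature).

    Assumes a symmetric proposal; this is a generic MCMC step, not the
    trained proposal described in the abstract.
    """
    if score_proposed >= score_current:
        return True
    return rng.random() < math.exp((score_proposed - score_current) / temperature)


# Hypothetical property values for one candidate molecule.
d = {
    "drug_likeness": desirability(0.8, 0.0, 1.0, maximize=True),
    "synthetic_accessibility": desirability(3.0, 1.0, 10.0, maximize=False),
}
w = {"drug_likeness": 2.0, "synthetic_accessibility": 1.0}
print(round(mpo_score(d, w), 3))
```

The geometric mean is used here instead of a weighted sum because it penalizes a candidate that fails badly on any single property, which matches the motivation of balancing conflicting properties such as affinity and toxicity. A weighted sum would be an equally valid choice under different assumptions.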
