Generative models are increasingly used in drug discovery, often coupled with absorption, distribution, metabolism, and excretion (ADME) bioassays or quantitative structure-activity relationship (QSAR) models to optimize a given set of properties. The molecules proposed by these algorithms often turn out to be false positives; that is, they are predicted to be active but prove inactive after synthesis and testing. This is mostly due to overoptimization of the predicted scores, which leads to a decrease or stagnation of the real scores. This behavior is also known as the "hacking" of the predictive models by the generative model during the optimization step. The issue is reminiscent of adversarial examples in machine learning and can be seen as an instance of Goodhart's law: "when a measure becomes a target, it ceases to be a good measure." It is even more apparent in the multiparameter optimization (MPO) case, where the models need to extrapolate outside the training set distribution because no known molecules in the initial training set satisfy all the objectives simultaneously. Experimental evaluation of this problem is hard and expensive, since it requires synthesis and testing of the generated molecules. Thus, efforts have been made to develop in silico "oracles" (real-valued functions used as proxies for molecular properties) to help evaluate these generative-model-based pipelines. However, these oracles have had limited value so far because they are often too easy to model in comparison with biological assays and are usually limited to mono-objective cases. In this work, we introduce a simulator of multitarget assays using a smartly initialized neural network (NN) that returns continuous values for any input molecule. We use this oracle to replicate a real-world prospective lead optimization (LO) scenario.
First, we trained predictive models on a small initial sample of molecules to predict their oracle values. Next, we generated new optimized molecules using the open-source GuacaMol package coupled with the previously built predictive models. Finally, we selected compounds matching the candidate drug target profile (CDTP) according to the predicted values and evaluated them by computing the true oracle values. We observed that even when the predictive models had excellent estimated performance metrics, the final selection still contained multiple false positives according to the NN-based oracle. We then evaluated the optimization behavior in mono- and bi-objective scenarios using either a logistic regression or a random forest predictive model. We also propose and evaluate several methods to help mitigate the hacking issue.
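The train, optimize, and evaluate loop described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not the setup used in this work: molecules are replaced by random 64-bit vectors, the oracle is a small fixed random network, the predictive model is a linear regressor fitted by stochastic gradient descent (standing in for logistic regression or a random forest), and a greedy bit-flip search stands in for the GuacaMol generator. The names `make_oracle`, `fit_linear`, `hill_climb`, and `run` are hypothetical.

```python
import math
import random

N_BITS = 64  # length of the toy binary "fingerprint" (assumption)


def make_oracle(rng, hidden=16):
    """Fixed random one-hidden-layer network standing in for the assay simulator."""
    w1 = [[rng.gauss(0, 1) for _ in range(N_BITS)] for _ in range(hidden)]
    w2 = [rng.gauss(0, 1) for _ in range(hidden)]

    def oracle(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) / math.sqrt(N_BITS))
             for row in w1]
        z = sum(w * hj for w, hj in zip(w2, h))
        return 1.0 / (1.0 + math.exp(-z))  # continuous score between 0 and 1

    return oracle


def fit_linear(xs, ys, epochs=200, lr=0.01):
    """Linear surrogate trained by SGD on the small labeled sample."""
    w = [0.0] * N_BITS
    b = sum(ys) / len(ys)
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = b + sum(wi * xi for wi, xi in zip(w, x)) - y
            b -= lr * err
            for i, xi in enumerate(x):
                if xi:
                    w[i] -= lr * err
    return lambda x: b + sum(wi * xi for wi, xi in zip(w, x))


def hill_climb(predict, start, rng, steps=300):
    """Greedy single-bit-flip search that maximizes the *predicted* score only."""
    best, best_score = list(start), predict(start)
    for _ in range(steps):
        cand = list(best)
        cand[rng.randrange(N_BITS)] ^= 1
        score = predict(cand)
        if score > best_score:
            best, best_score = cand, score
    return best


def run(seed=0, n_train=30):
    rng = random.Random(seed)
    oracle = make_oracle(rng)
    train_x = [[rng.randint(0, 1) for _ in range(N_BITS)] for _ in range(n_train)]
    train_y = [oracle(x) for x in train_x]
    predict = fit_linear(train_x, train_y)
    start = max(train_x, key=oracle)  # best known "molecule" in the sample
    optimized = hill_climb(predict, start, rng)
    return {
        "start_pred": predict(start), "start_true": oracle(start),
        "opt_pred": predict(optimized), "opt_true": oracle(optimized),
    }


result = run()
print(result)
```

Comparing `opt_pred` with `opt_true` across several seeds gives a rough feel for how far the predicted score can drift from the oracle score once the search leaves the training sample. The size and direction of that gap in this toy setting say nothing about the results reported in this work.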