Molecular Generation for Desired Transcriptome Changes With Adversarial Autoencoders

Gene expression profiles are useful for assessing the efficacy and side effects of drugs. In this paper, we propose a new generative model that infers drug molecules that could induce a desired change in gene expression. Our model, the Bidirectional Adversarial Autoencoder, explicitly separates the cellular processes captured in gene expression changes into two feature sets: those related to the drug incubation and those unrelated to it. The model uses the related features to produce a drug hypothesis. We validated the model on the LINCS L1000 dataset by generating molecular structures in the SMILES format for a desired transcriptional response. In our experiments, we show that the proposed model can generate novel molecular structures that could induce a given gene expression change, or predict the gene expression difference after incubation with a given molecular structure. The code of the model is available at https://github.com/insilicomedicine/BiAAE.
