Applications of Deep-Learning in Exploiting Large-Scale and Heterogeneous Compound Data in Industrial Pharmaceutical Research

In recent years, the development of high-throughput screening (HTS) technologies and their establishment in an industrialized environment have enabled scientists to test millions of molecules and profile them against a multitude of biological targets in a short period of time, generating data at a much faster pace and with higher quality than before. Besides the structure-activity data from traditional bioassays, more complex assays such as transcriptomics profiling or imaging have also been established as routine profiling experiments thanks to advances in next-generation sequencing and automated microscopy technologies. In industrial pharmaceutical research, these technologies are typically combined with automated platforms to enable efficient handling of screening collections of thousands to millions of compounds. To exploit the ever-growing amount of data generated by these approaches, computational techniques are constantly evolving. In this regard, artificial intelligence technologies such as deep learning and other machine learning methods play a key role in cheminformatics and bio-image analytics, where they address activity prediction, scaffold hopping, de novo molecule design, reaction and retrosynthesis prediction, and high-content screening analysis. Herein we summarize the current state of large-scale compound data analysis in industrial pharmaceutical research and describe the impact this analysis has had on the drug discovery process over the last two decades, with a specific focus on deep-learning technologies.
