The Next Era: Deep Learning in Pharmaceutical Research

Over the past decade, machine learning algorithms of increasing sophistication have entered daily use, from internet search, voice recognition, and social network software to machine vision in cameras, phones, robots, and self-driving cars. Pharmaceutical research has also seen its fair share of machine learning developments. For example, applying such methods to mine the growing datasets created in drug discovery enables us not only to learn from the past but also to predict a molecule’s properties and behavior in the future. The latest machine learning method garnering significant attention is deep learning, which uses artificial neural networks with multiple hidden layers. Publications over the last 3 years suggest that this method may have advantages over previous machine learning methods and offer a slight but discernible edge in predictive performance. The time has come for a balanced review of this technique, and also for applying machine learning methods such as deep learning across a wider array of endpoints relevant to pharmaceutical research for which datasets are growing, such as physicochemical property prediction, formulation prediction, absorption, distribution, metabolism, excretion and toxicity (ADME/Tox), target prediction, and skin permeation. We also show that there are many potential applications of deep learning beyond cheminformatics. It will be important to perform prospective testing (which has rarely been carried out to date) in order to convince skeptics that there will be benefits from investing in this technique.
