Prediction of drug-protein interaction and drug repositioning using machine learning model

Background: Traditional drug development is time-consuming and expensive, whereas computer-aided drug repositioning can improve efficiency and productivity. In this study, we propose a machine learning pipeline to predict binding interactions between proteins and marketed or studied drugs. We then use the predicted interactions to construct a protein network that can reveal drugs potentially shared between proteins and thus suggest candidates for drug repositioning.

Methods: Binding information between proteins and drugs from the Binding Database and physicochemical properties of drugs from the ChEMBL database were used to build the machine learning models, specifically support vector regression models. We further measured the proportionality between proteins based on their predicted binding affinities and used edge betweenness centrality to construct a protein similarity network for drug repositioning.

Results: As a proof of concept, we demonstrated that our machine learning approach is capable of reflecting the binding strength between drugs and the target protein. By comparing the coefficients of the protein models, we found that the proteins SYUA and TAU may share common ligands that were not in our training data. Using the edge betweenness centrality network based on the prediction proportionality of the protein models, we identified AK1C2 as a potential target of aspirin, a binding interaction that has been validated.

Conclusions: Our approach can be applied not only to drug repositioning, by comparing protein models or searching the protein-protein network, but also to predicting binding strength once sufficient experimental data are available to train the protein models.
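The network step described in the Methods can be illustrated with a minimal sketch. It assumes that each protein model has already produced predicted affinities for the same set of drugs, that proportionality is measured with a rho-style statistic on log-transformed predictions, and that two proteins are linked when rho exceeds a cutoff. The function names, the cutoff, and the toy values are hypothetical and are not taken from the study; the regression step itself is not shown.

```python
import math
from collections import deque
from itertools import combinations
from statistics import variance


def rho(x, y):
    """Rho-style proportionality of two positive vectors on the log scale.

    Assumption: plain log transform; 1 means perfectly proportional.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    diff = [a - b for a, b in zip(lx, ly)]
    return 1.0 - variance(diff) / (variance(lx) + variance(ly))


def build_graph(pred, threshold):
    """Link two proteins when rho of their predicted affinities >= threshold."""
    graph = {p: set() for p in pred}
    for a, b in combinations(pred, 2):
        if rho(pred[a], pred[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def edge_betweenness(graph):
    """Unnormalized edge betweenness (Brandes' algorithm, unweighted graph)."""
    score = {frozenset((u, v)): 0.0 for u in graph for v in graph[u]}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0.0 for v in graph}
        sigma[s] = 1.0
        preds = {v: [] for v in graph}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Each undirected pair of endpoints was counted from both directions.
    return {edge: value / 2.0 for edge, value in score.items()}


# Hypothetical predicted affinities of three proteins for the same four drugs.
toy_pred = {
    "protein_A": [1.0, 2.0, 4.0, 8.0],
    "protein_B": [2.0, 4.0, 8.0, 16.0],  # proportional to protein_A
    "protein_C": [8.0, 4.0, 2.0, 1.0],   # inversely related to protein_A
}
toy_graph = build_graph(toy_pred, threshold=0.9)
toy_scores = edge_betweenness(toy_graph)
```

In this toy example only protein_A and protein_B are linked, because their predicted affinities differ by a constant factor. In a larger network, edges with high betweenness would mark bridges between groups of proteins with similar predicted binding profiles.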
