Proteochemometrics - recent developments in bioactivity and selectivity modeling.

Proteochemometrics is a machine learning-based modeling approach that combines ligand and protein descriptors. With ongoing developments in machine learning and the growth of public data, the technique is increasingly applied in early drug discovery, typically for ligand-target binding prediction. Common applications include improving on single-target quantitative structure-activity relationship models, modeling protein selectivity and promiscuity, and large-scale deep learning approaches. The gain in predictive power from proteochemometrics is observed in multi-target bioactivity modeling, which opens the door to more extensive studies covering whole protein families. In addition, with deep learning enabling more complex and larger-scale models, proteochemometrics allows faster and higher-quality computational models that support the design-make-test cycle.
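To make the core idea concrete, the following is a minimal sketch of how a ligand descriptor and a protein descriptor can be joined into one feature vector per ligand-protein pair, so that a single model covers several targets. The descriptor values, the activity values, the names (`pcm_features`, `predict_1nn`, `lig_a`, `prot_x`), and the choice of a 1-nearest-neighbour learner are all illustrative assumptions and are not taken from the paper; published studies typically use chemical fingerprints, sequence-derived protein descriptors, and stronger learners.

```python
from math import dist


def pcm_features(ligand_desc, protein_desc):
    """Join one ligand vector and one protein vector into a single pair descriptor."""
    return list(ligand_desc) + list(protein_desc)


def predict_1nn(train_X, train_y, query):
    """Return the activity of the closest training pair (Euclidean distance)."""
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    return train_y[best]


# Hypothetical toy descriptors (assumption: 3 ligand features, 2 protein features).
ligands = {"lig_a": [1.0, 0.0, 0.5], "lig_b": [0.0, 1.0, 0.2]}
proteins = {"prot_x": [0.3, 0.7], "prot_y": [0.9, 0.1]}

# Made-up activity values for three measured ligand-protein pairs.
measured = {
    ("lig_a", "prot_x"): 7.2,
    ("lig_a", "prot_y"): 5.1,
    ("lig_b", "prot_x"): 6.0,
}

# One training row per measured pair: ligand features followed by protein features.
train_X = [pcm_features(ligands[l], proteins[p]) for l, p in measured]
train_y = list(measured.values())

# Predict the unmeasured pair (lig_b, prot_y) from the combined descriptor space.
query = pcm_features(ligands["lig_b"], proteins["prot_y"])
prediction = predict_1nn(train_X, train_y, query)
print(prediction)
```

Because the protein is part of the input, the same model can be queried for a ligand against a target it was never measured on, which is what distinguishes this setup from a single-target structure-activity model.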
