Drug design by machine-trained elastic networks: predicting Ser/Thr-protein kinase inhibitors’ activities

An elastic network model (ENM) represents a molecule as a matrix of pairwise atomic interactions. Rich in coded information, ENMs are proposed here as a novel tool for predicting the activity of series of molecules that have widely different chemical structures but a common biological activity. The new approach is developed and tested using a set of 183 inhibitors of the serine/threonine-protein kinase enzyme Plk3, which is implicated in the regulation of the cell cycle and tumorigenesis. The elastic network (EN) predictive model is found to exhibit high accuracy and speed compared to descriptor-based machine-trained modeling. EN modeling appears to be a highly promising new tool for the high demands of industrial applications such as drug and materials design.
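To make the idea of a molecule as a matrix of pairwise atomic interactions concrete, the sketch below builds a simple connectivity (Kirchhoff-type) matrix from atomic coordinates. It is only an illustration under stated assumptions: the distance cutoff, the uniform spring weight of 1, the function name `kirchhoff_matrix`, and the three-atom coordinates are all hypothetical, and the abstract does not say that the authors construct their networks this way.

```python
import math


def kirchhoff_matrix(coords, cutoff=2.0):
    """Return a connectivity matrix for a set of atoms.

    Assumption: two atoms are joined by a unit-strength spring when their
    distance is at most `cutoff` (same length unit as the coordinates).
    Off-diagonal entries are -1 for connected pairs and 0 otherwise;
    each diagonal entry counts the springs attached to that atom.
    """
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


# Hypothetical linear three-atom fragment, neighbours 1.5 units apart.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
gamma = kirchhoff_matrix(coords)
for row in gamma:
    print(row)
```

In a machine-learning setting, a matrix of this kind (or quantities derived from it, such as its eigenvalues) could serve as the input features for an activity model; which features the authors actually use is not stated in the abstract.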
