Computational chemistry approach to protein kinase recognition using 3D stochastic van der Waals spectral moments

Three‐dimensional (3D) protein structures now frequently lack functional annotation because structures are being solved faster than their biological activity can be characterized experimentally. As a result, predicting structure‐function relationships for proteins is an active research field in computational chemistry, with implications for medicinal chemistry, biochemistry and proteomics. In previous studies, stochastic spectral moments were used to predict protein stability or function (González‐Díaz, H. et al. Bioorg Med Chem 2005, 13, 323; Biopolymers 2005, 77, 296). However, those moments consider only electrostatic interactions and ignore other important factors such as van der Waals interactions. The present study introduces a new class of 3D structure molecular descriptors for folded proteins, the stochastic van der Waals spectral moments (${}^{o}\beta_k$). Among many possible applications, kinase recognition was selected because no previous computational chemistry studies in this area had been reported, despite the widespread distribution of kinases. The best linear model found was $K_{act} = -9.44\,{}^{o}\beta_0(c) + 10.94\,{}^{o}\beta_5(c) - 2.40\,{}^{o}\beta_0(i) + 2.45\,{}^{o}\beta_5(m) + 0.73$, where core (c), inner (i) and middle (m) refer to specific spatial regions of the protein. The model, with a high Matthews correlation coefficient (0.79), correctly classified 206 out of 230 proteins (89.6%) across the training and prediction series combined. An area under the ROC curve of 0.94 distinguishes the model from a random classifier (area 0.5). A subsequent principal components analysis of 152 heterogeneous proteins showed that ${}^{o}\beta_k$ encodes information different from that of other descriptors used in protein computational chemistry studies. Finally, the model recognized 110 out of 125 kinases (88.0%) in a virtual screening experiment, which serves as an additional validation because these proteins were not used in the training or prediction series.
© 2007 Wiley Periodicals, Inc. J Comput Chem 2007
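
The linear model reported in the abstract can be evaluated with a few lines of code. The following is a minimal sketch, not the authors' implementation: the coefficients and intercept are taken from the abstract, while the dictionary keys, the function names, the example descriptor values and the decision cutoff of 0 are all assumptions made for illustration (the abstract does not state the classification threshold). The helper `matthews_coefficient` uses the standard confusion-matrix definition of the Matthews coefficient, and `percent` reproduces the percentages quoted in the abstract.

```python
import math

# Coefficients and intercept of the linear model as quoted in the abstract.
# The key names are hypothetical labels for the four descriptors.
COEFFS = {
    "beta0_core": -9.44,
    "beta5_core": 10.94,
    "beta0_inner": -2.40,
    "beta5_middle": 2.45,
}
INTERCEPT = 0.73


def kinase_score(descriptors):
    """Return K_act for one protein given its four spectral-moment values."""
    return INTERCEPT + sum(COEFFS[name] * descriptors[name] for name in COEFFS)


def is_kinase(descriptors, threshold=0.0):
    """Classify by score; the cutoff of 0 is an ASSUMPTION, not from the abstract."""
    return kinase_score(descriptors) > threshold


def matthews_coefficient(tp, tn, fp, fn):
    """Standard Matthews coefficient computed from a 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def percent(correct, total):
    """Percentage of correctly classified cases, rounded to one decimal."""
    return round(100.0 * correct / total, 1)


# Hypothetical descriptor values, chosen only to exercise the arithmetic.
example = {
    "beta0_core": 1.0,
    "beta5_core": 1.0,
    "beta0_inner": 1.0,
    "beta5_middle": 1.0,
}
score = kinase_score(example)
overall_accuracy = percent(206, 230)   # training + prediction series
screening_accuracy = percent(110, 125) # virtual screening set
```

Keeping the coefficients in a dictionary keyed by descriptor name makes the correspondence with the published equation easy to check term by term. Real use would require descriptor values computed from the 3D structures, which this sketch does not attempt.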
