Classifying kinase conformations using a machine learning approach

Background: Signaling proteins such as protein kinases adopt a diverse array of conformations to respond to regulatory signals in signaling pathways. Perhaps the most fundamental conformational change of a kinase is the transition between active and inactive states, and defining the conformational features associated with kinase activation is critical for selectively targeting abnormally regulated kinases in disease. While manual examination of crystal structures has led to the identification of key structural features associated with kinase activation, the large number of kinase crystal structures (~3,500) and the extensive conformational diversity displayed by the protein kinase superfamily pose unique challenges in fully defining these features. Although some computational approaches have been proposed, they are typically based on a small subset of crystal structures and use measurements biased towards the active site geometry.

Results: We use an unbiased, informatics-based machine learning approach to classify all eukaryotic protein kinase conformations deposited in the PDB. We show that the orientation of the activation segment, measured by φ, ψ, χ1, and pseudo-dihedral angles, classifies kinase crystal conformations more accurately than existing methods. We show that the formation of the K-E salt bridge is statistically dependent upon the activation segment orientation, and we identify evolutionary differences between the activation segment conformations of tyrosine and serine/threonine kinases. We provide evidence that our method can identify conformational changes associated with the binding of allosteric regulatory proteins, and show that the greatest variation in inactive structures comes from kinase group- and family-specific side chain orientations.

Conclusion: We have provided the first comprehensive machine learning-based classification of protein kinase active/inactive conformations, taking into account more structures and measurements than any previous classification effort. Further, our unbiased classification of inactive structures reveals residues associated with kinase functional specificity. To enable classification of new crystal structures, we have made our classifier publicly accessible through a stand-alone program housed at https://github.com/esbg/kinconform [DOI:10.5281/zenodo.249090].
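The angular measurements named in the Results (φ, ψ, χ1, and pseudo-dihedral angles) are all instances of one geometric quantity: the signed dihedral angle defined by four points. The sketch below illustrates that calculation in plain Python; it is not taken from the kinconform code base. The function name `dihedral` and the example coordinates are hypothetical. The atom quadruples listed in the comments are the standard backbone and side chain definitions, while treating a pseudo-dihedral as four consecutive Cα atoms is an assumption that may differ from the definition used by the authors.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four 3D points.

    The angle is positive for a clockwise rotation from p0 to p3
    when viewed along the central bond p1 -> p2.
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("p1 and p2 must be distinct points")
    axis = tuple(c / norm for c in b1)
    # Project the outer bond vectors onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


# Atom quadruples for residue i:
#   phi  : C(i-1), N(i),  CA(i), C(i)
#   psi  : N(i),   CA(i), C(i),  N(i+1)
#   chi1 : N(i),   CA(i), CB(i), gamma atom (e.g. CG, OG, SG)
#   pseudo-dihedral (assumed here): CA(i), CA(i+1), CA(i+2), CA(i+3)

if __name__ == "__main__":
    # Hypothetical coordinates, chosen only to exercise the function.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

In a pipeline of the kind the abstract describes, angles like these would be computed for the activation segment residues of each structure and passed as features to a classifier. Because angles wrap around at ±180°, a common design choice is to encode each one as a sine/cosine pair before training; whether the authors do so is not stated here.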
