Polyphony: superposition independent methods for ensemble-based drug discovery

Background: Structure-based drug design is an iterative process that cycles through structural biology, computer-aided design, synthetic chemistry and bioassay. In favorable circumstances, this process can yield hundreds of protein-ligand crystal structures. In addition, molecular dynamics (MD) simulations are increasingly being used to explore the conformational landscape of these complexes further. Currently, methods capable of analyzing ensembles of crystal structures and MD trajectories are limited and usually rely upon least-squares superposition of coordinates.

Results: Novel methodologies are described for the analysis of multiple structures of a protein. Statistical approaches that rely upon residue equivalence, but not superposition, are developed. Tasks that can be performed include the identification of hinge regions, allosteric conformational changes and transient binding sites. The approaches are tested on crystal structures of CDK2 and other CMGC protein kinases and on a simulation of p38α. Known relationships between interactions and conformational changes are highlighted, and new ones are revealed. A transient but druggable allosteric pocket in CDK2 is predicted to occur under the CMGC insert. Furthermore, an evolutionarily conserved conformational link from the location of this pocket, via the αEF-αF loop, to phosphorylation sites on the activation loop is discovered.

Conclusions: New methodologies are described and validated for the superposition-independent conformational analysis of large collections of structures or simulation snapshots of the same protein. The methodologies are encoded in a Python package called Polyphony, which is released as open source to accompany this paper [http://wrpitt.bitbucket.org/polyphony/].
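To make the idea of a comparison that needs residue equivalence but no superposition concrete, the following is a minimal sketch. It is not Polyphony's API and is not claimed to be the statistic used in the paper: it assumes each structure is reduced to one coordinate per equivalent residue (for example the Cα atom) and scores each residue by how much its distances to all other residues vary across the ensemble. The function names `distance_matrix` and `residue_variability` are hypothetical.

```python
import math
from statistics import pstdev


def distance_matrix(coords):
    """Pairwise distances between residues of one structure.

    coords: list of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    Internal distances are unchanged by rotation or translation, so no
    superposition is needed before structures are compared.
    """
    return [[math.dist(a, b) for b in coords] for a in coords]


def residue_variability(ensemble):
    """Illustrative per-residue variability score (an assumption, not the
    paper's method).

    ensemble: list of structures with the same residue order, i.e. residue
    equivalence is assumed. For each residue i, the population standard
    deviation of the i-j distance across the ensemble is averaged over all
    other residues j.
    """
    matrices = [distance_matrix(structure) for structure in ensemble]
    n = len(matrices[0])
    scores = []
    for i in range(n):
        spreads = [
            pstdev([m[i][j] for m in matrices])
            for j in range(n)
            if j != i
        ]
        scores.append(sum(spreads) / len(spreads))
    return scores


# Toy data: four residues on a line, spaced 3.8 units apart.
base = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
# The same structure after a 90-degree rotation about z plus a translation.
rotated = [(-y + 5.0, x - 2.0, z + 1.0) for x, y, z in base]
# A copy in which only the last residue has moved.
moved = base[:3] + [(7.6, 3.8, 0.0)]

rigid_scores = residue_variability([base, rotated])
moved_scores = residue_variability([base, moved])
```

In this toy case the rigid-body copy gives scores that are zero to within floating-point error, while the ensemble containing `moved` gives its largest score at the last residue. The nested loops scale with the square of the chain length, so a real ensemble would call for an array library, but the standard library keeps the sketch self-contained.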
