Visualisation of variable binding pockets on protein surfaces by probabilistic analysis of related structure sets

Background: Protein structures provide a valuable resource for rational drug design. For a protein with no known ligand, computational tools can predict surface pockets of suitable size and shape to accommodate a complementary small-molecule drug. However, pocket prediction against single static structures may miss features of pockets that arise from proteins' dynamic behaviour. In particular, ligand-binding conformations can be observed as transiently populated states of the apo protein, so it is possible to gain insight into ligand-bound forms by considering conformational variation in apo proteins. This variation can be explored by considering sets of related structures: computationally generated conformers, solution NMR ensembles, multiple crystal structures, homologues or homology models. It is non-trivial to compare pockets, either from different programs or across sets of structures. For a single structure, difficulties arise in defining a particular pocket's boundaries. For a set of conformationally distinct structures, the challenge is how to make reasonable comparisons between them, given that a perfect structural alignment is not possible.

Results: We have developed a computational method, Provar, that provides a consistent representation of predicted binding pockets across sets of related protein structures. The outputs are probabilities that each atom or residue of the protein borders a predicted pocket. These probabilities can be readily visualised on a protein using existing molecular graphics software. We show how Provar simplifies comparison of the outputs of different pocket prediction algorithms, of pockets across multiple simulated conformations, and of pockets between homologous structures. We demonstrate the benefits of using multiple structures for protein-ligand and protein-protein interface analysis on a set of complexes and consider three case studies in detail: i) analysis of a kinase superfamily highlights the conserved occurrence of surface pockets at the active and regulatory sites; ii) a simulated ensemble of unliganded Bcl2 structures reveals extensions of a known ligand-binding pocket not apparent in the apo crystal structure; iii) visualisations of interleukin-2 and its homologues highlight conserved pockets at the known receptor interfaces and regions whose conformation is known to change on inhibitor binding.

Conclusions: Through post-processing of the output of a variety of pocket prediction software, Provar provides a flexible approach to the analysis and visualisation of the persistence or variability of pockets in sets of related protein structures.
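The per-atom output described in the Results can be illustrated with a minimal sketch. It is not Provar's actual implementation: it assumes that a pocket predictor has already been run on each conformer of the same protein, that its output has been reduced to a set of pocket-lining atoms per structure, and that atoms are matched across conformers by a hypothetical `(residue_number, atom_name)` key. Under these assumptions, the probability for an atom is estimated as the fraction of structures in which it lines a predicted pocket. The function name and the toy data are invented for illustration.

```python
from collections import Counter


def pocket_lining_probability(lining_atoms_per_structure):
    """Estimate, for each atom, the fraction of structures in which it
    borders a predicted pocket.

    lining_atoms_per_structure: one collection of atom keys per structure.
    Atom keys are hypothetical (residue_number, atom_name) tuples, which
    only match across conformers of the same protein.
    """
    n_structures = len(lining_atoms_per_structure)
    if n_structures == 0:
        raise ValueError("at least one structure is required")

    counts = Counter()
    for atoms in lining_atoms_per_structure:
        # set() ensures an atom is counted at most once per structure.
        counts.update(set(atoms))

    # Atoms that never line a pocket are absent (implicit probability 0).
    return {atom: count / n_structures for atom, count in counts.items()}


# Toy data: pocket-lining atoms in four conformers (illustrative only).
conformers = [
    {(10, "CA"), (10, "CB"), (45, "OG")},
    {(10, "CA"), (45, "OG")},
    {(10, "CA")},
    {(10, "CA"), (72, "NZ")},
]

probabilities = pocket_lining_probability(conformers)
for atom, p in sorted(probabilities.items()):
    print(atom, p)
```

One common way to display such values in molecular graphics software is to store them in a per-atom numeric field of the structure file, such as the B-factor column of a PDB file, and colour by that field; whether this matches Provar's own output format is an assumption here. For homologues, where atoms do not correspond one-to-one, the residue-level key would have to come from a sequence or structural alignment instead.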
