Predicting protein-ligand binding specificity based on ensemble clustering

Protein structure comparison algorithms can be used to identify distantly related proteins or to categorize differences in binding specificity. When proteins are observed in different conformations, distant relationships can go unrecognized unless flexible representations of whole protein structures are used. Such representations offer a sophisticated description of backbone motion, but they do not incorporate the potential motion of every atom. As a result, existing representations, both rigid and flexible, cannot compensate for atomic motions that can make binding sites with similar binding preferences appear different. To bridge this gap, this paper presents a tool for comparing protein binding sites despite conformational changes in the binding site. Our method employs ensemble clustering techniques to incorporate the diversity of binding site variations observed in conformational samples of binding site motion. We applied the method to protein conformations from the serine protease and enolase superfamilies. Our results demonstrate that this approach can distinguish proteins with similar binding preferences in the presence of considerable binding site flexibility.
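
To make the idea of ensemble clustering concrete, the following is a minimal sketch of one common consensus scheme, co-association (evidence accumulation) clustering. It is not the method of this paper, whose details are not given in the abstract. The sketch assumes that each conformational sample has already produced one clustering of the same set of proteins; the function names, the 0.5 threshold, and the toy labels are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Return the fraction of base partitions in which each pair of items
    is assigned to the same cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    total = len(partitions)
    return [[c / total for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Merge items that co-cluster in more than `threshold` of the base
    partitions (hypothetical cutoff), using a simple union-find."""
    co = co_association(partitions)
    n = len(co)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy input: five proteins clustered independently in three
# conformational samples (labels are arbitrary within each sample).
partitions = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0],
]
consensus = consensus_clusters(partitions)
```

In this toy input the third protein switches groups in one sample, but it co-clusters with the first two proteins in two of the three samples, so the consensus keeps it with them. This is the sense in which an ensemble can absorb variation that would mislead any single conformation.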
