PocketAlign: A Novel Algorithm for Aligning Binding Sites in Protein Structures

A fundamental task in bioinformatics is the transfer of knowledge from one protein molecule to another by recognizing similarities between them. Such similarities can be detected at different levels: sequence, whole fold, or important substructures. Comparing binding sites is important for understanding functional similarities among proteins and for understanding drug cross-reactivities. Current methods in the literature each have their own merits and demerits, which warrants the exploration of newer concepts and algorithms, especially for large-scale comparisons and for obtaining accurate residue-wise mappings. Here, we report the development of a new algorithm, PocketAlign, for obtaining structural superpositions of binding sites. The software is available as a web service at http://proline.physics.iisc.ernet.in/pocketalign/. The algorithm encodes shape descriptors in the form of geometric perspectives, supplemented by chemical group classification. The shape descriptor considers several perspectives, each with one residue as the focus, and captures the relative distribution of the other residues around it in a given site. Residue-wise pairings are computed by comparing the set of perspectives of the first site with that of the second, followed by a greedy approach that incrementally combines residue pairings into a mapping. The mappings in different frames are then evaluated by different metrics that encode the extent of alignment of individual geometric perspectives. Different initial seed alignments are computed, each is subsequently extended by detecting consequential atomic alignments in a three-dimensional grid, and the best 500 are stored in a database. Alignments are then ranked, and the top-scoring alignments are reported and streamed into PyMOL for visualization and analysis. The method is validated for accuracy and sensitivity and benchmarked against existing methods.
An advantage of PocketAlign over some of the existing binding site comparison tools in the literature is that it explores different schemes for identifying an alignment and thus has better potential to capture similarities in ligand recognition abilities. By finding a detailed alignment of a pair of sites, PocketAlign provides insight into why two sites are similar and which residues and atoms contribute to the similarity.
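
The perspective comparison and greedy pairing described above can be illustrated with a minimal Python sketch. It is not the PocketAlign implementation: the representation of a perspective as a sorted list of distances from the focus residue, the `CHEM_GROUP` table, the 0.5 Å tolerance, and the function names `perspective`, `perspective_overlap`, and `greedy_mapping` are all assumptions made for illustration, and the seed extension on a three-dimensional grid is omitted.

```python
import math
from itertools import product

# Hypothetical chemical grouping (assumption); the actual classification
# used by PocketAlign is not specified in the abstract.
CHEM_GROUP = {
    "ASP": "acidic", "GLU": "acidic",
    "LYS": "basic", "ARG": "basic", "HIS": "basic",
    "SER": "polar", "THR": "polar", "ASN": "polar", "GLN": "polar",
}


def perspective(site, focus):
    """Sorted distances from the focus residue to every other residue.

    `site` is a list of (residue_name, (x, y, z)) tuples, one point per residue.
    """
    origin = site[focus][1]
    return sorted(
        math.dist(origin, xyz) for k, (_, xyz) in enumerate(site) if k != focus
    )


def perspective_overlap(p, q, tol=0.5):
    """Count distances in two sorted lists that agree within `tol`."""
    i = j = hits = 0
    while i < len(p) and j < len(q):
        if abs(p[i] - q[j]) <= tol:
            hits += 1
            i += 1
            j += 1
        elif p[i] < q[j]:
            i += 1
        else:
            j += 1
    return hits


def greedy_mapping(site_a, site_b, tol=0.5):
    """Greedily combine the best-scoring residue pairings into a one-to-one mapping."""
    scored = []
    for i, j in product(range(len(site_a)), range(len(site_b))):
        group_a = CHEM_GROUP.get(site_a[i][0], "other")
        group_b = CHEM_GROUP.get(site_b[j][0], "other")
        if group_a != group_b:
            continue  # only pair residues from the same chemical group
        score = perspective_overlap(
            perspective(site_a, i), perspective(site_b, j), tol
        )
        scored.append((score, i, j))

    # Highest overlap first; indices break ties deterministically.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))

    used_a, used_b, mapping = set(), set(), []
    for _, i, j in scored:
        if i not in used_a and j not in used_b:
            mapping.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return mapping


# Toy example: site_b is site_a translated and listed in a different order.
site_a = [("ASP", (0, 0, 0)), ("HIS", (3, 0, 0)), ("SER", (0, 4, 0))]
site_b = [("SER", (10, 14, 10)), ("ASP", (10, 10, 10)), ("HIS", (13, 10, 10))]
print(greedy_mapping(site_a, site_b))
```

Because the sketch compares only internal distances, the pairing does not depend on how either site is positioned or oriented, which is why no superposition is needed before the residue pairings are computed. A superposition could then be derived from the paired residues in a separate step.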
