SCOWLP classification: Structural comparison and analysis of protein binding regions

Background

Detailed information about protein interactions is critical for understanding the principles that govern protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands, bound either in the same or in different binding regions. The structural interactome therefore requires tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others, and may also facilitate a detailed comparative analysis of the interaction information for specific protein binding regions at the atomic level. Such a classification might be useful for deciphering protein interaction networks, understanding protein function, and rational engineering and design.

Description

Protein binding regions (PBRs) would ideally be described as well-defined, separate regions that share no interacting residues with one another. However, PBRs are often irregular and discontinuous, and they can share a wide range of interacting residues. The criteria used to define an individual binding region are often arbitrary and may differ from those used for other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed.

We extract detailed interaction information for protein domains, peptides and interfacial solvent from the SCOWLP database and classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlap of interacting residues mapped onto pairwise structural alignments. We perform the classification by agglomerative hierarchical clustering using the complete-linkage method. The classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature that is especially useful for protein families with conflicting binding regions.

The hierarchical classification of PBRs is implemented in the SCOWLP database and extends the SCOP classification with three additional family sub-levels: Binding Region, Interface and Contacting Domains. SCOWLP contains 9,334 binding regions distributed within 2,561 families. In 65% of cases, a family contains more than one binding region. In addition, 22% of the regions form complexes with more than one protein family.

Conclusion

The current SCOWLP classification and its web application represent a framework for the study of protein interfaces and the comparative analysis of protein family binding regions. This comparison can be performed at the atomic level and allows the user to study interactome conservation and variability. The new SCOWLP classification may be of great utility for the reconstruction of protein complexes, the understanding of protein networks and ligand design. SCOWLP will be updated with every SCOP release. The web application is available at http://www.scowlp.org.
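The clustering step outlined in the Description can be illustrated with a minimal sketch. The abstract does not give the formula of the similarity index, so the overlap ratio used here (shared positions divided by the size of the smaller region) is only an assumption, and the function names, region labels and residue positions are hypothetical toy values, not SCOWLP data. Each binding region is represented as a set of alignment positions of its interacting residues, and clusters are merged by complete linkage until no pair reaches the chosen similarity cut-off.

```python
from itertools import combinations


def similarity(a, b):
    """Hypothetical overlap index (an assumption, not the published formula):
    shared aligned positions divided by the size of the smaller region."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def complete_linkage(regions, cutoff):
    """Agglomerative clustering with complete linkage on similarities.

    Two clusters are merged only if even their least similar pair of
    members reaches the cutoff.
    """
    clusters = [[name] for name in regions]

    def linkage(c1, c2):
        # Complete linkage: the weakest pairwise similarity across clusters.
        return min(similarity(regions[x], regions[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest complete-linkage score.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) < cutoff:
            break  # no remaining pair is similar enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(c) for c in clusters)


# Toy input: interacting residues given as alignment column indices (invented).
regions = {
    "A": {1, 2, 3, 4},
    "B": {2, 3, 4, 5},
    "C": {4, 5, 6, 7},
    "D": {20, 21, 22},
}

# Run the same clustering at a strict and a permissive cut-off.
for cutoff in (0.7, 0.2):
    print(cutoff, complete_linkage(regions, cutoff))
```

With these toy sets, the strict cut-off groups only A and B, while the permissive one also absorbs the partially overlapping region C and leaves the disjoint region D on its own. This is the kind of flexibility that computing the classification at several cut-offs is meant to provide.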
