A comparison of SCOP and CATH with respect to domain–domain interactions

The analysis and prediction of protein–protein interaction sites from structural data are restricted by the limited availability of structural complexes that represent the complete protein–protein interaction space. The domain classification schemes CATH and SCOP are normally used independently in the analysis and prediction of protein domain–domain interactions. In this article, the effect of the choice of domain classification scheme on the number and type of domain–domain interactions observed in structural data is systematically evaluated for the SCOP and CATH hierarchies. Although there is a large overlap in domain assignments between SCOP and CATH, in a nonredundant set 23.6% of CATH interfaces had no SCOP equivalent and 37.3% of SCOP interfaces had no CATH equivalent. Combining both classifications therefore gives an increase of between 23.6 and 37.3% in domain–domain interfaces. It is suggested that, where possible, both domain classification schemes should be used together; if only one is selected, SCOP provides better coverage than CATH. Employing both SCOP and CATH reduces, by an estimated 6.5%, the false negative rate of predictive methods that use homology matching to structural data to predict protein–protein interactions. Proteins 2008. © 2007 Wiley‐Liss, Inc.
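
The coverage figures quoted above are fractions of one scheme's interfaces that have no counterpart in the other scheme. The following is a minimal sketch of that kind of calculation, not the procedure used in the article: it assumes each interface has already been reduced to a comparable key (the strings below are hypothetical placeholders) and that two interfaces count as equivalent when their keys match. The function name and the toy data are illustrative only.

```python
from typing import Set


def fraction_without_equivalent(own: Set[str], other: Set[str]) -> float:
    """Return the fraction of interfaces in `own` with no match in `other`."""
    if not own:
        return 0.0
    return len(own - other) / len(own)


# Hypothetical interface keys; real keys would come from mapping each
# domain-domain interface onto a shared reference (an assumption here).
scop = {"if_a", "if_b", "if_c", "if_d", "if_e"}
cath = {"if_a", "if_b", "if_c", "if_f"}

scop_only = fraction_without_equivalent(scop, cath)  # SCOP interfaces missing from CATH
cath_only = fraction_without_equivalent(cath, scop)  # CATH interfaces missing from SCOP
combined = scop | cath                               # interfaces seen by either scheme

print(f"SCOP interfaces with no CATH equivalent: {scop_only:.1%}")
print(f"CATH interfaces with no SCOP equivalent: {cath_only:.1%}")
print(f"Interfaces in the combined set: {len(combined)}")
```

With real data, the two fractions would play the role of the 37.3% and 23.6% values reported in the abstract; how equivalence between a SCOP interface and a CATH interface is decided is the substantive step and is not specified here.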
