dConsensus: a tool for displaying domain assignments by multiple structure-based algorithms and for construction of a consensus assignment

Background
Partitioning a protein into structural components, known as domains, is an important initial step in protein classification and in functional and evolutionary studies. Although systematic domain assignments by human experts exist (CATH and SCOP), the introduction of high-throughput technologies for structure determination threatens to overwhelm expert approaches. A variety of algorithmic methods have been developed to expedite this process, allowing almost instant structural decomposition into domains. The performance of algorithmic methods can approach 85% agreement with the expert consensus on the number of domains. However, each algorithm takes a somewhat different conceptual approach, and each has its own strengths and weaknesses. Currently there is no simple way to automatically compare assignments from different structure-based domain assignment methods. Such a comparison would give a comprehensive picture of the possible partitionings of a structure, as well as some insight into the tendencies of particular algorithms. Most importantly, a consensus assignment drawn from multiple assignment methods can provide a single and presumably more accurate view.

Results
We introduce dConsensus (http://pdomains.sdsc.edu/dConsensus), a web resource that displays the results of calculations from multiple algorithmic methods and generates a consensus domain assignment with an associated reliability score. Domain assignments from seven structure-based algorithms (PDP, PUU, DomainParser2, NCBI method, DHcL, DDomains and Dodis) are available for analysis and comparison alongside assignments made by expert methods. The assignments are available for all protein chains in the Protein Data Bank (PDB). A consensus domain assignment is built either by allowing each algorithm to contribute equally (simple approach) or by weighting the contribution of each method by its prior performance and observed tendencies (weighted approach). An analysis of secondary structure around domain and fragment boundaries is also available for display and further analysis.

Conclusion
dConsensus provides a comprehensive assignment of protein domains. For the first time, seven algorithmic methods are brought together, with no need to access each method separately via a web server or a local copy of the software. This aggregation permits a consensus domain assignment to be computed. Viewing the consensus side by side with selected methods gives the user insight into the fundamental units of protein structure, which are central to the study of evolutionary and functional relationships.
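
The two consensus strategies named in the Results (equal contribution versus performance-weighted contribution) can be illustrated with a minimal voting sketch. This is not the dConsensus implementation: the abstract does not specify how votes are combined or how the reliability score is defined. The sketch assumes that each method reports only a number of domains for a chain, that the consensus is the count with the largest total vote, and that reliability is the fraction of total weight behind that count. The function name `consensus_domain_count`, the tie-breaking rule, and all weights and per-method counts below are hypothetical.

```python
from collections import defaultdict


def consensus_domain_count(assignments, weights=None):
    """Vote on the number of domains for one protein chain.

    assignments: dict mapping method name -> predicted number of domains.
    weights:     optional dict mapping method name -> vote weight
                 (hypothetical values; methods not listed get weight 1.0).
                 With weights=None every method contributes equally.

    Returns (consensus_count, reliability), where reliability is the
    fraction of the total vote weight that supports the consensus count.
    This definition of reliability is an assumption made for illustration.
    """
    if not assignments:
        raise ValueError("at least one method assignment is required")

    # Accumulate the vote weight behind each candidate domain count.
    totals = defaultdict(float)
    for method, n_domains in assignments.items():
        weight = 1.0 if weights is None else weights.get(method, 1.0)
        totals[n_domains] += weight

    total_weight = sum(totals.values())
    if total_weight <= 0:
        raise ValueError("total vote weight must be positive")

    # Pick the count with the most weight; ties go to the smaller count
    # (an arbitrary choice made only for this sketch).
    best = max(totals, key=lambda n: (totals[n], -n))
    return best, totals[best] / total_weight


# Hypothetical domain counts for a single chain (not real data).
example = {
    "PDP": 2, "PUU": 2, "DomainParser2": 1, "NCBI": 2,
    "DHcL": 3, "DDomains": 2, "Dodis": 1,
}

# Simple approach: every method has one vote.
simple = consensus_domain_count(example)

# Weighted approach: hypothetical weights favouring two of the methods.
weighted = consensus_domain_count(
    example, weights={"DomainParser2": 3.0, "Dodis": 3.0}
)
```

With these made-up inputs the simple vote selects two domains (four of seven methods agree), while the weighted vote selects one domain, which shows how weighting by prior performance can change the consensus. A real consensus would also have to reconcile domain boundaries, not just domain counts.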
