'Protein Peeling': an approach for splitting a 3D protein structure into compact fragments

MOTIVATION: This study proposes a new method for identifying the small compact units that compose protein three-dimensional structures. These fragments, called 'protein units' (PUs), provide a new level of description for better understanding and analyzing the organization of protein structures. The method relies only on the contact probability matrix, i.e. the inter-Cα distances translated into probabilities. It follows the principle of conventional hierarchical clustering and yields a series of nested partitions of the 3D structure. Each step aims to divide a unit optimally into two or three subunits according to a criterion called the 'partition index', which assesses the structural independence of the newly defined subunits. In addition, an entropy-derived squared correlation R is used to assess the dissection of the protein structure globally. The method is compared with other splitting algorithms and shows relevant performance.

AVAILABILITY: An Internet server with dedicated tools is available at http://www.ebgm.jussieu.fr/~gelly/
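The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a logistic transform from Cα distance to contact probability (the parameters `d0` and `delta` are illustrative guesses) and a hypothetical independence score, `partition_score`, that rewards strong contacts within two candidate subunits and penalizes contacts between them. Only the two-subunit split is shown; all function names are invented for illustration.

```python
import math

def contact_probabilities(coords, d0=8.0, delta=1.5):
    """Turn pairwise C-alpha distances into contact probabilities.

    Assumption: a logistic curve centred on d0 (angstroms) with
    softness delta. Neither value is taken from the abstract.
    """
    n = len(coords)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(coords[i], coords[j])
                p[i][j] = 1.0 / (1.0 + math.exp((d - d0) / delta))
    return p

def block_sum(p, rows, cols):
    """Sum the probabilities in one rectangular block of the matrix."""
    return sum(p[i][j] for i in rows for j in cols)

def partition_score(p, start, cut, end):
    """Hypothetical independence score for splitting [start, end) at cut.

    alpha and beta are the contacts inside each candidate subunit,
    gamma the contacts between them. The score approaches 1 when the
    two subunits barely touch each other.
    """
    a, b = range(start, cut), range(cut, end)
    alpha = block_sum(p, a, a)
    beta = block_sum(p, b, b)
    gamma = block_sum(p, a, b)
    denom = (alpha + gamma) * (beta + gamma)
    return (alpha * beta - gamma * gamma) / denom if denom > 0 else 0.0

def best_two_way_cut(p, start, end, min_size=3):
    """Return the cut position with the highest score, or None if the
    unit is too short to split into two pieces of at least min_size."""
    cuts = range(start + min_size, end - min_size + 1)
    return max(cuts, key=lambda c: partition_score(p, start, c, end),
               default=None)

# Toy input: two straight 5-residue segments placed 100 angstroms apart.
toy = [(3.8 * k, 0.0, 0.0) for k in range(5)] + \
      [(100.0 + 3.8 * k, 0.0, 0.0) for k in range(5)]
p_toy = contact_probabilities(toy)
cut = best_two_way_cut(p_toy, 0, len(toy))
print(cut)
```

On this toy input the selected cut falls between the two segments. Applying `best_two_way_cut` recursively to each resulting piece would give a series of nested partitions in the spirit of the hierarchical scheme described above.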
