Prediction of Inter-residue Contact Clusters from Hydrophobic Cores

Motivation: A contact map is a key representation of a specific protein structure. As previous work has reported, even a corrupted contact map can be used to reconstruct its corresponding protein structure. Predicting the contact map is therefore a partial route to predicting the structure of a protein. To simplify contact map prediction, we instead predict inter-residue contact clusters, each centered on a group of neighboring contacts.

Results: In this paper, we adopt an SVM-based approach to predict the centers of inter-residue contact clusters. The inputs to the SVM predictor are the sequence profile, the evolutionary rate, and the predicted secondary structure. The predictor is built on hydrophobic cores, which may be regarded as the locations of groups of neighboring inter-residue contacts. As a result, about 35% of the inter-residue contact cluster centers can be predicted accurately.
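To make the notion of a contact map concrete, the following is a minimal sketch of how a binary contact map might be computed from residue coordinates. It is an illustration only, not the method of this paper: the function name `contact_map`, the 8 Å distance cutoff, the use of one representative point per residue, and the toy coordinates are all assumptions introduced here.

```python
import math

def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return a symmetric 0/1 matrix marking residue pairs in contact.

    coords: list of (x, y, z) tuples, one representative atom per residue
            (assumed here; the source does not specify the atom choice).
    cutoff: distance threshold in angstroms (8.0 is an assumed value).
    min_separation: minimum sequence separation |i - j| for a pair to count.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = 1
                cmap[j][i] = 1  # contacts are symmetric
    return cmap

# Toy example: four residues spaced 3.8 angstroms apart along a line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(coords)
for row in cmap:
    print(row)
```

Raising `min_separation` restricts the map to pairs that are farther apart in sequence, which is one common way to focus on non-local contacts.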
