Investigation of the proteins folding rates and their properties of amino acid networks

Abstract The mechanism of protein folding is an important problem in molecular biology. Protein folding is usually thought of as a complex-system process that involves the entire molecule. In this article, we investigate 78 native-state structures of folding proteins from a complex-network perspective, to understand the role of topological parameters in protein folding kinetics. Thirty-one parameters were calculated from the amino acid networks of the folding proteins, and the relationship between those parameters and the protein folding rates was systematically analyzed. Our results show that the parameters that differ significantly between two-state and multi-state folding proteins correlate well with the folding rates. We also find that classifying the proteins into different classes can improve the correlation coefficient between the parameters and the folding rates, from 0.926 to 0.983 for two- and multi-state proteins, respectively. Genetic Algorithms–Multiple Linear Regression (GA–MLR) was adopted to select the best subset of the 31 parameters for constructing the MLR model and to avoid over-fitting. Our method gives a correlation coefficient of 0.921 for all folding proteins, based on the classification of the folding proteins. The results indicate that general topological parameters of the amino acid networks of folding proteins can effectively represent structural and functional properties, such as folding rates.
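To make the idea of "topological parameters of an amino acid network" concrete, the following is a minimal sketch, not the authors' procedure. The abstract does not say how the networks were built, so the construction here is an assumption: each residue is a node represented by a single coordinate (for example its C-alpha atom), and two residues are linked when they lie within a hypothetical cutoff of 7.0 Å. The three quantities computed (average degree, average clustering coefficient, characteristic path length) are common network descriptors and are not claimed to be among the 31 parameters used in the study. All function names are illustrative.

```python
from collections import deque
from itertools import combinations
from math import dist


def build_network(coords, cutoff=7.0):
    """Link residues i and j when their distance is <= cutoff (assumed value)."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_degree(adj):
    """Mean number of contacts per residue."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with < 2 neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected pairs, via breadth-first search."""
    total, pairs = 0, 0
    for source in adj:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        total += sum(seen.values())
        pairs += len(seen) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    # Toy coordinates in angstroms, purely illustrative (not taken from any structure).
    toy = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
    net = build_network(toy)
    print(average_degree(net), average_clustering(net), characteristic_path_length(net))
```

In a study like the one described, descriptors of this kind would be computed per protein and then regressed against the measured folding rates; the choice of cutoff and of node representation can change the values noticeably, so both should be treated as tunable assumptions.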
