Structural information involved in the interpretation of the stepwise protein folding process

Abstract: Calculating the quantity of information involved in each step of the protein folding process suggests that a multistep model requires less information than a one-step model. Quantitative analysis shows that the amino acids in the polypeptide chain do not carry enough information to accurately predict the values of the angles Φ and Ψ in folded proteins. This conclusion follows from comparing the amount of information carried by amino acids with the amount of information needed to determine Φ and Ψ when the complete Ramachandran map is taken as the conformational space. The two-step model, comprising an early stage (ES) and a late stage (LS), is shown to require less information because the final prediction of Φ and Ψ can be based on a preexisting ES structure. The information-theoretic analysis also points to particular zones of the Ramachandran map that appear to play an important role in protein structure prediction.
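The comparison described in the abstract can be illustrated with a minimal sketch based on Shannon's self-information, I = -log2(p). The sketch is not the authors' procedure: the function names, the uniform 1/20 residue frequency, and the 5-degree grid over the Ramachandran map are all assumptions chosen only to show the kind of gap being compared.

```python
import math


def self_information_bits(p: float) -> float:
    """Shannon self-information, -log2(p), of an event with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


def uniform_map_bits(step_deg: float) -> float:
    """Bits needed to pick one cell of a 360 x 360 degree (phi, psi) map
    split into step_deg x step_deg cells, assuming all cells equally likely."""
    cells_per_axis = 360.0 / step_deg
    return math.log2(cells_per_axis ** 2)


# Hypothetical inputs for illustration only (not values from the paper):
# a residue drawn uniformly from 20 amino acids, and a 5-degree grid.
bits_from_residue = self_information_bits(1.0 / 20.0)
bits_needed = uniform_map_bits(5.0)

print(f"carried by one residue:  {bits_from_residue:.2f} bits")
print(f"needed for (phi, psi):   {bits_needed:.2f} bits")
```

Under these assumptions a single residue carries fewer bits than are needed to select one cell of the full map. Restricting the conformational space to a smaller region, as a preexisting ES structure would, lowers the second figure.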
