Identifying sequence regions undergoing conformational change via predicted continuum secondary structure

MOTIVATION: Conformational flexibility is essential to the function of many proteins, for example in catalytic activity. To assist efforts to determine and explore the functional properties of a protein, it is desirable to automatically identify regions that are prone to undergo conformational changes. It was recently shown that a probabilistic predictor of continuum secondary structure is more accurate than categorical predictors for structurally ambivalent sequence regions, suggesting that such models are suited to characterizing protein flexibility.

RESULTS: We develop a computational method for identifying regions that are prone to conformational change directly from the amino acid sequence. The method uses the entropy of the probabilistic output of an 8-class continuum secondary structure predictor. Results for 171 unique amino acid sequences with well-characterized variable structure (identified in the 'Macromolecular movements database') indicate that the method is highly sensitive at identifying flexible protein regions, but false positives remain a problem. The method can be used to explore the conformational flexibility of proteins (including hypothetical or synthetic ones) whose structure is yet to be determined experimentally.

AVAILABILITY: The predictor, sequence data and supplementary studies are available at http://pprowler.itee.uq.edu.au/sspred/ and are free for academic use.
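To make the entropy criterion named in the abstract concrete, the following is a minimal sketch of scoring each residue by the Shannon entropy of its 8-class probability vector and reporting contiguous high-entropy stretches. The abstract does not say how the entropy is thresholded, smoothed or grouped into regions, so the function names, the base-2 logarithm, the 2.0-bit threshold and the minimum region length are all illustrative assumptions rather than the published procedure. The example profile is invented for demonstration and is not predictor output.

```python
import math
from typing import List, Sequence, Tuple


def residue_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in bits of one residue's class distribution.

    Zero-probability classes contribute nothing (limit of p*log p as p -> 0).
    """
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def flexible_regions(
    profile: Sequence[Sequence[float]],
    threshold: float,
    min_len: int = 1,
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) residue ranges with entropy >= threshold.

    `threshold` and `min_len` are hypothetical tuning parameters; the
    abstract does not specify how high-entropy residues are grouped.
    """
    regions: List[Tuple[int, int]] = []
    start = None
    for i, probs in enumerate(profile):
        high = residue_entropy(probs) >= threshold
        if high and start is None:
            start = i  # a high-entropy run begins here
        elif not high and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None  # the run has ended
    # Close a run that extends to the end of the sequence.
    if start is not None and len(profile) - start >= min_len:
        regions.append((start, len(profile)))
    return regions


# Invented 6-residue profile over 8 secondary structure classes.
confident = [0.93] + [0.01] * 7   # one dominant class, low entropy
ambivalent = [0.125] * 8          # uniform over 8 classes, entropy of 3 bits
profile = [confident, confident, ambivalent, ambivalent, ambivalent, confident]

print(flexible_regions(profile, threshold=2.0, min_len=2))  # [(2, 5)]
```

With 8 classes the entropy ranges from 0 bits (a fully confident prediction) to 3 bits (a uniform distribution), so any cut-off has to sit inside that interval. Lowering the threshold trades fewer missed flexible regions for more false positives, which is the trade-off the abstract reports.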
