Modeling Disordered Regions in Proteins Using Rosetta

Protein structure prediction methods such as Rosetta search for the lowest energy conformation of the polypeptide chain. However, the experimentally observed native state is at a minimum of the free energy, rather than the energy. The neglect of the missing configurational entropy contribution to the free energy can be partially justified by the assumption that the entropies of alternative folded states, while very much less than unfolded states, are not too different from one another, and hence can be to a first approximation neglected when searching for the lowest free energy state. The shortcomings of current structure prediction methods may be due in part to the breakdown of this assumption. Particularly problematic are proteins with significant disordered regions which do not populate single low energy conformations even in the native state. We describe two approaches within the Rosetta structure modeling methodology for treating such regions. The first does not require advance knowledge of the regions likely to be disordered; instead these are identified by minimizing a simple free energy function used previously to model protein folding landscapes and transition states. In this model, residues can be either completely ordered or completely disordered; they are considered disordered if the gain in entropy outweighs the loss of favorable energetic interactions with the rest of the protein chain. The second approach requires identification in advance of the disordered regions either from sequence alone using for example the DISOPRED server or from experimental data such as NMR chemical shifts. During Rosetta structure prediction calculations the disordered regions make only unfavorable repulsive contributions to the total energy. We find that the second approach has greater practical utility and illustrate this with examples from de novo structure prediction, NMR structure calculation, and comparative modeling.

[1]  Christopher J. Oldfield,et al.  Intrinsically disordered protein. , 2001, Journal of molecular graphics & modelling.

[2]  Jens Meiler,et al.  ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. , 2011, Methods in enzymology.

[3]  C Venclovas,et al.  Processing and analysis of CASP3 protein structure predictions , 1999, Proteins.

[4]  V. Uversky Intrinsically Disordered Proteins , 2000 .

[5]  David Baker,et al.  Protein Structure Prediction Using Rosetta , 2004, Numerical Computer Methods, Part D.

[6]  Oliver F. Lange,et al.  Consistent blind protein structure generation from NMR chemical shift data , 2008, Proceedings of the National Academy of Sciences.

[7]  David S Wishart,et al.  A simple method to predict protein flexibility using secondary chemical shifts. , 2005, Journal of the American Chemical Society.

[8]  D. Baker,et al.  De novo protein structure generation from incomplete chemical shift assignments , 2009, Journal of biomolecular NMR.

[9]  H. Dyson,et al.  Intrinsically unstructured proteins and their functions , 2005, Nature Reviews Molecular Cell Biology.

[10]  Marc S. Cortese,et al.  Coupled folding and binding with alpha-helix-forming molecular recognition elements. , 2005, Biochemistry.

[11]  J. S. Sodhi,et al.  Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. , 2004, Journal of molecular biology.

[12]  E. Alm,et al.  Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[13]  Marc S. Cortese,et al.  Coupled folding and binding with α-helix-forming molecular recognition elements , 2005 .