Statistical mechanical treatment of protein conformation. 5. A multistate model for specific-sequence copolymers of amino acids.

One-dimensional short-range interaction models for specific-sequence copolymers of amino acids have been developed in this series of papers. In the present paper, a multistate model is developed for the prediction of protein backbone conformation. The states considered are right-handed helical (hR), extended (epsilon), chain-reversal (R and S), left-handed helical (hL), right-handed bridge-region (zeta R), left-handed bridge-region (zeta L), and coil (or other) (c). This model involves ten parameters (WhR, VhR, V epsilon, VR, VS, WhL, VhL, U zeta R, U zeta L, and Uc) and requires a 10 × 10 statistical weight matrix. Assuming that left-handed helical sequences cannot occur in proteins, this 10 × 10 matrix can be reduced to a 9 × 9 matrix with nine parameters (WhR, VhR, V epsilon, VR, VS, VhL, U zeta R, U zeta L, and Uc). A nearest neighbor approximation of this multistate model is also formulated; with the omission of left-handed helical sequences and the inclusion of the left-handed bridge region in the c state, this approximate model requires a 7 × 7 matrix with statistical weights WhR, VhR, VS, VhL, U zeta R, and Uc, expressed as values relative to the statistical weight of the epsilon state. The statistical weights for the multistate model are evaluated from the atomic coordinates of the X-ray structures of 26 native proteins. These statistical weights and the multistate model are then applied to the prediction of the backbone conformations of proteins. The conformational probabilities of finding a residue in the hR, epsilon, R, S, hL, zeta R, or c states, defined as relative values with respect to their average values over the whole molecule, are calculated for bovine pancreatic trypsin inhibitor and clostridial flavodoxin in order to select the most probable conformation for each residue of these proteins. The predicted results are compared with experimental observations and are discussed together with the reliability of the statistical weights.
In the Appendix, the property of asymmetric nucleation of helical sequences is introduced into the (nearest neighbor) multistate model.
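As an illustration of how a nearest neighbor statistical weight matrix yields per-residue conformational probabilities, the following is a minimal sketch of a generic one-dimensional transfer-matrix calculation. It is not the 7 × 7 matrix of this paper: the function names, the array layout (`weights`, `coupling`), and all numerical values are assumptions chosen for illustration only, and the state labels merely echo those used above. The relative probability is taken, as in the text, as the probability of a state at a residue divided by its average over the whole chain.

```python
import numpy as np


def state_probabilities(weights, coupling):
    """Per-residue state probabilities for a nearest neighbor chain model.

    weights  : (N, K) array, hypothetical statistical weight of state k at residue i
    coupling : (K, K) array, hypothetical factor for state a at i followed by b at i+1
    Returns (probs, z), where probs[i, k] is the probability of state k at residue i
    and z is the partition function. No rescaling is done, so this sketch is only
    suitable for short chains.
    """
    n, k = weights.shape
    forward = np.zeros((n, k))
    backward = np.zeros((n, k))

    # Forward pass: sum over all conformations of residues 0..i ending in each state.
    forward[0] = weights[0]
    for i in range(1, n):
        forward[i] = (forward[i - 1] @ coupling) * weights[i]

    # Backward pass: sum over all conformations of residues i+1..N-1.
    backward[-1] = 1.0
    for i in range(n - 2, -1, -1):
        backward[i] = coupling @ (weights[i + 1] * backward[i + 1])

    z = forward[-1].sum()
    probs = forward * backward / z
    return probs, z


def relative_probabilities(probs):
    """Divide each state's probability by its average over the whole chain.

    Assumes every state has a nonzero average probability.
    """
    return probs / probs.mean(axis=0)


if __name__ == "__main__":
    states = ["hR", "epsilon", "c"]  # illustrative subset of state labels
    # Made-up weights for a five-residue chain (rows: residues, columns: states).
    weights = np.array([
        [1.2, 1.0, 0.8],
        [1.5, 1.0, 0.6],
        [0.4, 1.0, 1.3],
        [0.7, 1.0, 1.1],
        [1.1, 1.0, 0.9],
    ])
    # Made-up neighbor factors; values below 1 penalize that pair of adjacent states.
    coupling = np.array([
        [1.0, 0.3, 0.5],
        [0.3, 1.0, 1.0],
        [0.5, 1.0, 1.0],
    ])
    probs, z = state_probabilities(weights, coupling)
    rel = relative_probabilities(probs)
    for i, row in enumerate(rel):
        print(i, states[int(np.argmax(row))])
```

Selecting, for each residue, the state with the largest relative probability mirrors the selection rule described in the abstract; the forward and backward sums are simply an efficient way of evaluating the matrix products that define the partition function.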