What Is the Minimum Number of Residues to Determine the Secondary Structural State?

The failure of protein secondary structure prediction is commonly attributed to the neglect of long-range interactions. This raises a question: if secondary structure is stabilized only by local interactions, what is the minimum subsequence length required to determine the central secondary structural state? In the present work, the 20 amino acids were classified into eight groups in order to analyze systematically the relationship between the length of subsequences in the Protein Data Bank (PDB) and their secondary structural state. The fraction of subsequences with a unique central secondary structural state was found to increase with subsequence length, and the minimum length required to determine the central secondary structural state is about 14–17 residues. Thus, the low accuracy of secondary structure prediction does not result from the neglect of long-range interactions, but may instead result from the limited size of the available protein database or from limitations of the prediction algorithm.
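The counting procedure summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the eight amino acid groups, so the grouping in `ASSUMED_GROUPS` is a hypothetical placeholder, and the function names, the per-residue state strings (H, E, C), and the toy chains are likewise assumptions made for illustration. The sketch reduces each sequence to group labels, slides a window of a given length along it, and reports the fraction of distinct windows whose central residue is observed in only one secondary structural state.

```python
from collections import defaultdict

# Hypothetical reduced alphabet: the abstract does not specify its eight
# groups, so this grouping is an illustrative assumption only.
ASSUMED_GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]
GROUP_OF = {aa: str(i) for i, group in enumerate(ASSUMED_GROUPS) for aa in group}


def reduce_sequence(seq):
    """Replace each of the 20 amino acids by the label of its group."""
    return "".join(GROUP_OF[aa] for aa in seq)


def unique_fraction(chains, length):
    """Fraction of distinct reduced subsequences of the given length whose
    central residue occurs in exactly one secondary structural state.

    `chains` is an iterable of (sequence, states) pairs of equal length.
    For even lengths the residue at index length // 2 is taken as central.
    """
    states_seen = defaultdict(set)
    for seq, states in chains:
        reduced = reduce_sequence(seq)
        for start in range(len(reduced) - length + 1):
            window = reduced[start:start + length]
            states_seen[window].add(states[start + length // 2])
    if not states_seen:
        return 0.0
    unique = sum(1 for observed in states_seen.values() if len(observed) == 1)
    return unique / len(states_seen)


# Toy data, invented for illustration (not taken from the PDB).
TOY_CHAINS = [("AVLKDE", "HHHEEC"), ("GLIARP", "CHHECC")]

if __name__ == "__main__":
    for length in (1, 3, 5):
        print(length, unique_fraction(TOY_CHAINS, length))
```

On real data, `chains` would hold sequences paired with per-residue state assignments derived from PDB entries, and the fraction would be examined as a function of `length` to see where it approaches one.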