Quantitative structure-activity relationship modeling of peptide and protein behavior as a function of amino acid composition.

A quantitative structure-activity relationship (QSAR) modeling approach, based on the location of each amino acid along three axes obtained by principal component analysis (the z scores), was extended to physical and functional properties of proteins for which the proportion of particular amino acids, rather than a precise sequence, is the determining factor. Coomassie Brilliant Blue spectral responses were modeled for amino acid homopolymers (R = 0.926) and for proteins, the latter either as a function of their contents of six basic and aromatic amino acids (R = 0.976) or as a function of the contributions of these amino acids to the three z scores (R = 0.935). The ultraviolet absorbance of proteins was modeled in terms of the z score contributions of tyrosine, tryptophan, and cysteine (R = 0.995). These results suggest that many protein functional properties could be modeled in this manner. An approach to modeling peptide behaviors that depend on short sequences of amino acids was also considered.
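
As an illustration only, the composition-based descriptors described above might be computed along the following lines. This is a minimal sketch, not the procedure used in the study: it assumes that the "contribution" of an amino acid to a z score is its mole fraction in the protein multiplied by its z value, and the numbers in `PLACEHOLDER_Z` are hypothetical placeholders rather than published z scores. The function names and the example sequence are likewise invented for the example.

```python
from collections import Counter

# Hypothetical placeholder values for (z1, z2, z3); NOT published z scores.
# Substitute values from a published z-scale table before any real use.
PLACEHOLDER_Z = {
    "Y": (-1.0, 2.0, 0.5),   # tyrosine
    "W": (-4.0, 3.0, 1.0),   # tryptophan
    "C": (1.0, -1.5, 4.0),   # cysteine
}


def composition(sequence):
    """Return the mole fraction of each residue in a one-letter sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    total = len(sequence)
    return {aa: n / total for aa, n in counts.items()}


def z_contributions(sequence, z_table):
    """Sum (mole fraction * z value) over the residues listed in z_table.

    Assumption: a residue's contribution to an axis is its mole fraction
    times its z value on that axis. Returns one total per z axis.
    """
    fractions = composition(sequence)
    totals = [0.0, 0.0, 0.0]
    for aa, z_values in z_table.items():
        fraction = fractions.get(aa, 0.0)
        for axis in range(3):
            totals[axis] += fraction * z_values[axis]
    return tuple(totals)


# Invented eight-residue sequence: two Tyr, one Trp, one Cys.
descriptors = z_contributions("ACDWYYGK", PLACEHOLDER_Z)
```

Under these assumptions, the three totals would serve as the independent variables in a regression against a measured property such as ultraviolet absorbance; because only composition enters the calculation, any reordering of the same residues yields the same descriptors.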