Quantitation of the area of overlap between second-derivative amide I infrared spectra to determine the structural similarity of a protein in different states.

Maintaining a native-like structure of protein pharmaceuticals during lyophilization is an important aspect of formulation. Infrared spectroscopy can be used to evaluate how effectively a formulation protects the secondary structural integrity of a protein in the dried solid. This requires quantitative comparison of the overall similarity of infrared spectra in the conformationally sensitive amide I region. We initially used the correlation coefficient r, as defined by Prestrelski et al. (Biophys. J. 1993, 65, 661-671), for this quantitation. Occasionally, we noticed that the r value did not agree with a visual assessment of the spectral similarity. In some cases this was due to an offset in baselines, which artifactually produced an unreasonably low r value. Conversely, if the spectra were baseline corrected and the peak positions were closely similar but the relative peak heights differed, the r value would be unreasonably high. Our approach to avoiding these problems is to use area-normalized second-derivative spectra. We have found that quantitating the area of overlap between area-normalized spectra provides a reliable, objective method to compare overall spectral similarity. In the current report, we demonstrate this method with artificial data sets and with selected protein spectra taken from experiments in which unfolding was induced by lyophilization or guanidine hydrochloride. With this analysis, we document how the problems associated with calculating the correlation coefficient r are avoided.