PEPTIDE POTENTIAL ENERGY SURFACES AND PROTEIN FOLDING

This report outlines the utility of a 3D→1D transformation of peptide conformation, which leads to a linearized notation of protein secondary and tertiary structures that may be used for an objective description of protein folding. The method is intended to be descriptive, not predictive. It is established from first principles that the idealized 2D ψ–φ map must have nine minima. It is natural to ask whether all nine conformations actually occur in proteins. The objective is to repeat a previous analysis of 258 proteins, carried out with the program ECEPP2, using the improved ECEPP2 + polarization. The 258 proteins have known X-ray structures and contain 56 495 amino-acid residues with well-defined φ and ψ angles. The minima are identified with the aid of the nine ECEPP2 minima of Ac–Ala–NHMe, with a tolerance of ±40° in φ and ψ. ECEPP2 is improved by including the interacting induced-dipole polarization model, SIMPLEX-MS-3 geometry optimization and the calculation of the dipole moment from the point distribution of net charges. The relative frequency of occurrence of those conformations energetically favoured for enantiomers gg, etc. in the ψ–φ map of the backbone conformations of amino acids decreases as: ga/ga > gg/gg > gg/gg >> ag/ag > aa. For the amino acids, the same preference diminishes as: Pro >> Ile > Val > Leu > Thr > Met > Ala > Glu > Phe > Trp > Tyr > Gln > Lys > Ser > Cys > Arg > Asp > His > Asn > Gly. The strong preference of Pro agrees with its character as an α-helix and β-sheet breaker and as a β-turn and random-coil former. The results obtained with ECEPP2 and with the improved ECEPP2 + polarization are in good agreement. The relative frequencies of occurrence for achiral Gly are close to one.
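The window-based assignment of residues to the nine minima can be illustrated with a minimal sketch. It assumes idealized torsion values of +60° (g+), 180° (a) and −60° (g−) for both φ and ψ; these are placeholders, not the ECEPP2 minima of Ac–Ala–NHMe, which are not listed here. The names `IDEAL`, `classify` and `relative_frequencies`, and the φ-first label order, are hypothetical choices made for illustration only.

```python
from collections import Counter
from itertools import product

# ASSUMPTION: idealized torsion values in degrees, used as stand-ins
# for the nine ECEPP2 minima of Ac-Ala-NHMe (not given in the text).
IDEAL = {"g+": 60.0, "a": 180.0, "g-": -60.0}
TOLERANCE = 40.0  # the +/- 40 degree tolerance quoted in the text


def angular_difference(x, y):
    """Smallest absolute difference between two angles, in degrees."""
    d = (x - y) % 360.0
    return min(d, 360.0 - d)


def classify(phi, psi, tolerance=TOLERANCE):
    """Return a label such as 'g-/a' (phi first, an assumed convention),
    or None when (phi, psi) lies outside all nine windows."""
    for (phi_name, phi0), (psi_name, psi0) in product(IDEAL.items(), repeat=2):
        if (angular_difference(phi, phi0) <= tolerance
                and angular_difference(psi, psi0) <= tolerance):
            return f"{phi_name}/{psi_name}"
    return None


def relative_frequencies(angles):
    """Fraction of the classified residues that fall in each window."""
    labels = (classify(phi, psi) for phi, psi in angles)
    counts = Counter(label for label in labels if label is not None)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}
```

With these assumed centres the windows are 80° wide and 120° apart, so they cannot overlap and each residue receives at most one label; for instance, `classify(-60.0, -50.0)` returns `"g-/g-"`. Residues outside every window are left unclassified and do not enter the frequencies.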
Pro is the amino acid with the greatest (gg, etc.)/(gg, etc.) preference and the greatest influence on protein conformation; it also has the largest global conformational parameter, P_global. The original software used in the investigation is available from the author.

Torrens, F. et al.

Introduction and Notation

Multidimensional conformational analysis (MCA) allows the topology of the potential energy surface (PES) to be predicted from the topology of the component potential energy curves (PEC), provided the molecular system is ideal [1–3]. In the case of three-fold periodicity, the 3 × 3 = 9 minima are energetically degenerate. This case applies to two –CH3 rotors, as in propane, and to molecules with two equivalent –CH3 groups. If the component PECs continue to have three minima, but these minima are energetically non-degenerate, the resultant PES will have nine non-equivalent minima. For the ideal PES, one can state that all nine minima have the same energy; for the non-ideal PES, one can state analogously that all nine minima have different energies. However, it is not possible to predict the energy spectrum of these nine minima, nor their relative stabilities. Nevertheless, as an intuitive guess, the following order is suggested for the relative stabilities of the diagonal elements:

E(O2) > E(O1) > E(O0)    (1)

where E is the energy. What is important to note is that the PES for a single peptide unit (cf. Scheme 1)