Empirical parametrization of pK values for carboxylic acids in proteins using a genetic algorithm.

Considerable effort has been devoted to the development of theoretical electrostatic methods to predict the pK values of ionizable residues in proteins. However, predictions often appear to remain at the qualitative or semi-quantitative level. We believe that, with the increasing number of experimentally available pK values for proteins of known structure, an alternative approach becomes feasible: the empirical parametrization of the experimental protein pK database. Of course, in the long term, this empirical approach is no substitute for rigorous electrostatic analysis but, in the short term, it may prove to have useful predictive power and may help to pinpoint the main structural determinants of pK values in proteins. Here we demonstrate the feasibility of the parametrization approach by fitting, with a genetic algorithm as the fitting tool, the database of carboxylic acid pK values in proteins to an empirical equation that takes into account the following two kinds of effects: (1) long-range charge-charge interactions; (2) interactions of the given carboxylic acid group with its environment in the protein, which are described in terms of contributions from the different kinds of atoms present in the protein (atomic contributions).
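
As an illustration only, the kind of fit described above can be sketched as follows. The abstract does not give the empirical equation, so the functional form here is an assumption: a reference pK plus one scaled charge-charge term plus a linear sum over counts of nearby atom types. The names `PK_MODEL`, `predict_pk`, `rmsd`, `fit_ga` and `make_synthetic_data`, the parameter ranges, and the genetic-algorithm settings (elitist selection, uniform crossover, Gaussian mutation) are all hypothetical, and the data are synthetic rather than experimental pK values.

```python
import random

# ASSUMED functional form (not taken from the source):
#   pK = PK_MODEL + a * charge_term + sum_t(c_t * n_t)
# charge_term stands in for a precomputed sum over long-range charges,
# n_t is the count of atoms of type t near the carboxylic acid group.
PK_MODEL = 4.0  # assumed reference pK for an isolated carboxylic acid


def predict_pk(params, charge_term, atom_counts):
    """Evaluate the assumed empirical equation for one carboxylic acid."""
    a, *c = params
    return PK_MODEL + a * charge_term + sum(ci * ni for ci, ni in zip(c, atom_counts))


def rmsd(params, data):
    """Root-mean-square deviation between predicted and target pK values."""
    sq = [(predict_pk(params, ct, counts) - pk) ** 2 for ct, counts, pk in data]
    return (sum(sq) / len(sq)) ** 0.5


def fit_ga(data, n_params, pop_size=60, generations=200, seed=0):
    """Minimal elitist genetic algorithm; returns (best_params, history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-2, 2) for _ in range(n_params)] for _ in range(pop_size)]
    history = []  # best RMSD found at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda p: rmsd(p, data))
        history.append(rmsd(pop[0], data))
        elite = pop[: pop_size // 4]  # the fittest quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            mother, father = rng.sample(elite, 2)
            # Uniform crossover: each gene is taken from either parent.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            # Gaussian mutation applied to each gene with probability 0.3.
            child = [g + rng.gauss(0, 0.05) if rng.random() < 0.3 else g for g in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda p: rmsd(p, data)), history


def make_synthetic_data(true_params, n=40, seed=1):
    """Generate noise-free synthetic (charge_term, atom_counts, pK) records."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        charge_term = rng.uniform(-0.5, 0.5)
        counts = [rng.randint(0, 10) for _ in true_params[1:]]
        data.append((charge_term, counts, predict_pk(true_params, charge_term, counts)))
    return data


if __name__ == "__main__":
    true_params = [-1.5, 0.10, -0.05, 0.08]  # hypothetical values
    data = make_synthetic_data(true_params)
    best, history = fit_ga(data, n_params=len(true_params))
    print("initial best RMSD:", history[0])
    print("final RMSD:", rmsd(best, data))
    print("fitted parameters:", best)
```

Because the fittest quarter of the population is carried over unchanged, the best RMSD cannot get worse from one generation to the next. With a real pK database, each record would be built from the protein structure, and the fitted atomic coefficients could then be inspected to see which atom types raise or lower the pK.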