The Prediction of the Degree of Exposure to Solvent of Amino Acid Residues Via Genetic Programming

In this paper I use genetic programming (Koza 1992; Koza 1994) to evolve programs that predict the degree of exposure to solvent (the buriedness) of each amino acid residue in a protein, taking only the protein's primary structure as input. I trained these programs on a set of 82 proteins from the Brookhaven Protein Data Bank (PDB) (Bernstein et al. 1977) and cross-validated them on a separate testing set of 40 proteins, also from the PDB. The best evolved program achieved a correlation of 0.434 between the predicted and observed buriedness values on the testing set.
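To make the evaluation measure concrete, the following is a minimal sketch of how a correlation between predicted and observed buriedness could be computed. It assumes the Pearson product-moment coefficient taken over per-residue values, which the text above does not specify. The function name and the sample values are hypothetical and are not taken from the experiments.

```python
import math


def pearson_correlation(predicted, observed):
    """Pearson correlation between two equal-length sequences of numbers.

    Assumption: the reported correlation is of this form; the source
    text does not name the coefficient that was used.
    """
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("at least two values are required")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Sums of products of deviations from the means.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    # A constant sequence has no defined correlation; return 0.0 by convention.
    if var_p == 0 or var_o == 0:
        return 0.0
    return cov / math.sqrt(var_p * var_o)


# Hypothetical per-residue buriedness values, for illustration only.
example_predicted = [0.2, 0.5, 0.9, 0.4, 0.7]
example_observed = [0.1, 0.6, 0.8, 0.3, 0.9]
example_r = pearson_correlation(example_predicted, example_observed)
```

Under this reading, a value of 1 would indicate a perfect linear relationship between predicted and observed buriedness, and a value of 0 would indicate none.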