Paper information - Use of Inductive Logic Programming to Learn Principles of Protein Structure

Use of Inductive Logic Programming to Learn Principles of Protein Structure

Inductive Logic Programming (ILP) has been applied to learn rules that characterise protein folds. Several representations for the background set have been explored, and the results have been interpreted in their biological context. In this paper, we present new results obtained with a background set containing information about protein topology. The new rules are more descriptive than the previous ones: where previous rules represented local motifs, often associated with functional regions, the new rules give more complete descriptions, often similar to those found in SCOP. Cross-validation experiments were conducted for the 20 most populated folds. The overall cross-validated accuracy was 75.1 ± 1.6% with the more limited background knowledge and 82.1 ± 1.4% with the additional information.
