Analysis of cleavage-site patterns in protein precursor sequences with a perceptron-type neural network.

A method for feature extraction from protein sequences, based on an artificial neural filter system, has been developed. Amino acid sequences are encoded by the physicochemical properties of their residues rather than by residue identity. With this representation, the optimized network weights can be displayed in Hinton diagrams and interpreted in a comprehensible and biochemically meaningful way. Signal peptidase cleavage sites of E. coli periplasmic proteins, human mitochondrial precursors and chloroplast precursors from spinach have been investigated. The network for E. coli periplasmic protein precursors classified both training and test data with 100% accuracy. The interpretation of its weights clearly confirms the "-3,-1 rule" and the existence of a hydrophobic core region starting at position -6. Further striking features and dominant positions can be found for all three types of cleavage sites.
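The general idea of property-based encoding followed by a perceptron can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of two properties (Kyte-Doolittle hydropathy and a crude "small residue" flag for A, G, S), the six-residue window covering positions -6 to -1, the invented toy sequences, and all function names are assumptions made purely for illustration.

```python
# Minimal sketch: physicochemical encoding + single-layer perceptron.
# All names, the property set, and the toy data are illustrative assumptions.

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
SMALL = set("AGS")  # assumption: a crude stand-in for "small residue"
N_PROPS = 2         # features per sequence position


def encode(window):
    """Encode each residue as [scaled hydropathy, is_small] and flatten."""
    feats = []
    for aa in window:
        feats.append(KD[aa] / 4.5)                 # scaled to [-1, 1]
        feats.append(1.0 if aa in SMALL else 0.0)
    return feats


def predict(w, b, x):
    """Threshold unit: 1 = cleavage-site window, 0 = other."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(samples, labels, epochs=100, lr=0.1):
    """Classic perceptron learning rule; stops once an epoch has no errors."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            delta = y - predict(w, b, x)
            if delta != 0:
                errors += 1
                w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
                b += lr * delta
        if errors == 0:
            break
    return w, b


def weights_by_position(w, first_position=-6):
    """Regroup the flat weight vector as {position: [weight per property]},
    the layout a Hinton diagram would display."""
    grid = {}
    for i in range(0, len(w), N_PROPS):
        grid[first_position + i // N_PROPS] = w[i:i + N_PROPS]
    return grid


# Invented toy windows (positions -6..-1); NOT real precursor sequences.
positives = ["LLVAQA", "LIVAHA", "VLLSFA"]   # small residues at -3 and -1
negatives = ["KDELLK", "DEKRLL", "LLKDEQ"]
X = [encode(s) for s in positives + negatives]
y = [1] * len(positives) + [0] * len(negatives)

w, b = train_perceptron(X, y)
grid = weights_by_position(w)
for pos in sorted(grid):
    print(pos, [round(v, 2) for v in grid[pos]])
```

Because each weight belongs to one position and one property, large positive or negative entries in the printed grid can be read directly as "this property matters at this position", which is the kind of interpretation a Hinton diagram supports. On a toy set this small the learned weights carry no biological meaning.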