A Neural Network to Detect Homologies in Proteins

To detect the presence and location of immunoglobulin (Ig) domains in amino acid sequences, we built a system based on a neural network with one hidden layer, trained with backpropagation. The program was designed to efficiently identify proteins that contain such domains, which are characterized by a few localized conserved regions and low overall homology. When we scanned the National Biomedical Research Foundation (NBRF) NEW protein sequence database to evaluate the program's performance, we obtained a very low rate of false negatives together with a moderate rate of false positives.
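As a rough illustration of the general approach, the sketch below trains a network with one hidden layer by backpropagation on fixed-length sequence windows, then slides it along a sequence to report candidate positions. The window length, one-hot encoding, hidden-layer size, learning rate, score threshold, and toy training windows are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
WINDOW = 5  # hypothetical window length, not taken from the paper


def encode(window):
    """One-hot encode a window of residues into a flat list of 0/1 floats."""
    vec = []
    for residue in window:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


class OneHiddenLayerNet:
    """Sigmoid network with one hidden layer and a single output unit."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last entry of each weight vector is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w1]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.w2[-1])
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One backpropagation update on squared error; returns the loss."""
        hidden, out = self.forward(x)
        d_out = (out - target) * out * (1.0 - out)
        # Hidden deltas are computed before the output weights are changed.
        d_hidden = [d_out * self.w2[j] * h * (1.0 - h)
                    for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w2[j] -= lr * d_out * h
        self.w2[-1] -= lr * d_out
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x):
                row[i] -= lr * d_hidden[j] * v
            row[-1] -= lr * d_hidden[j]
        return 0.5 * (out - target) ** 2


def train(net, examples, epochs=2000):
    """Cycle through (window, label) pairs; return the loss of each epoch."""
    losses = []
    for _ in range(epochs):
        losses.append(sum(net.train_step(encode(w), y) for w, y in examples))
    return losses


def scan(net, sequence, threshold=0.5):
    """Slide the window along a sequence; return (start, score) hits."""
    hits = []
    for start in range(len(sequence) - WINDOW + 1):
        _, score = net.forward(encode(sequence[start:start + WINDOW]))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy data invented for this sketch: 1 = "conserved region", 0 = background.
examples = [("CSVKW", 1.0), ("CAVKW", 1.0),
            ("GGGGG", 0.0), ("AELLK", 0.0), ("PPTSD", 0.0), ("MKLNE", 0.0)]
net = OneHiddenLayerNet(n_in=WINDOW * len(AMINO_ACIDS), n_hidden=4)
losses = train(net, examples)
hits = scan(net, "MKLNEGGGGGCSVKWAELLK")
```

Windows that only partly overlap the toy motif were never seen during training, so their scores are not meaningful here. A real detector would need many labeled windows and a held-out set to estimate false negative and false positive rates.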
