Efficient recognition of immunoglobulin domains from amino acid sequences using a neural network

A neural network was trained by back-propagation to recognize immunoglobulin domains from amino acid sequences. The program was designed to identify proteins containing such domains while keeping the rates of false positives and false negatives low. To evaluate its performance in recognizing mouse immunoglobulin sequences, the program was used to scan the National Biomedical Research Foundation NEW protein sequence database. It correctly recognized 55 of 56 mouse immunoglobulin sequences, a recognition efficiency of 98.2%, with an overall false positive rate of 7.3%. These data demonstrate that neural network-based search programs are well suited to finding sequences characterized by only a few well-conserved subsequences.
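The reported recognition efficiency can be checked directly, and the sketch below also shows one common way to present a peptide window to a network's input layer. This is a minimal illustration, not the authors' program: the abstract does not describe the input encoding, so the one-hot scheme, the 20-letter alphabet constant, and the function names `one_hot_window` and `recognition_efficiency` are all assumptions introduced here.

```python
# Hypothetical helpers for illustration; the abstract does not specify
# how sequences were encoded for the network.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_window(window):
    """Encode a peptide window as a flat 0/1 vector, 20 inputs per residue."""
    vec = []
    for residue in window:
        bits = [0] * len(AMINO_ACIDS)
        idx = AMINO_ACIDS.find(residue)
        if idx >= 0:  # unknown residues (e.g. 'X') stay all-zero
            bits[idx] = 1
        vec.extend(bits)
    return vec


def recognition_efficiency(recognized, total):
    """Percentage of known positive sequences that were recognized."""
    return 100.0 * recognized / total


encoded = one_hot_window("CAV")
print(len(encoded), sum(encoded))                # 60 inputs, 3 of them set
print(round(recognition_efficiency(55, 56), 1))  # 98.2, as reported
```

With 55 of 56 sequences recognized, the single missed sequence corresponds to a false negative rate of about 1.8%. The 7.3% false positive rate cannot be recomputed here because the abstract does not give the underlying counts.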
