AMINO ACID SEQUENCE ANALYSIS AND DESIGN BY ARTIFICIAL NEURAL NETWORKS AND SIMULATED MOLECULAR EVOLUTION: AN EVALUATION

The applicability of artificial neural networks as a fitness function for evolutionary protein design is discussed. The in machina design of idealized signal peptidase cleavage sites served as an example of this method in previous experiments. The results obtained and the design strategy selected are critically reviewed and evaluated. It is demonstrated that neural networks can extract relevant sequence information even from a small set of sequence data, provided that the data are representative and appropriately encoded. The physicochemical properties hydrophobicity and side-chain volume proved useful for describing signal peptide features. An encoding scheme for amino acids based on their individual preferences for main-chain folding angles is presented, and its use for extracting protein structural features from the amino acid sequence is proposed. Furthermore, a computer-based technique for the optimization of amino acid sequences, termed 'simulated molecular evolution', is evaluated.
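The optimization loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Kyte-Doolittle hydropathy scale, the elitist selection rule, the mutation scheme, and all names (`encode`, `toy_fitness`, `mutate`, `evolve`) are assumptions made for illustration. In particular, `toy_fitness` is a simple placeholder standing in for the trained neural network that the abstract describes as the fitness function.

```python
import random

# Kyte-Doolittle hydropathy values. Assumption: the abstract names
# hydrophobicity as a descriptor but does not say which scale was used.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
ALPHABET = sorted(HYDROPATHY)


def encode(seq):
    """Map each residue to its hydropathy value (one number per position)."""
    return [HYDROPATHY[aa] for aa in seq]


def toy_fitness(seq, target):
    """Hypothetical stand-in for a trained network: negative squared
    distance between the encoded sequence and a target property profile."""
    return -sum((h - t) ** 2 for h, t in zip(encode(seq), target))


def mutate(seq, rate, rng):
    """Replace each residue with a random one with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else aa for aa in seq
    )


def evolve(parent, target, generations=50, offspring=20, rate=0.1, seed=0):
    """Elitist mutation-selection loop: keep the best child of each
    generation only if it is at least as fit as the current parent."""
    rng = random.Random(seed)
    best, best_fit = parent, toy_fitness(parent, target)
    for _ in range(generations):
        children = [mutate(best, rate, rng) for _ in range(offspring)]
        candidate = max(children, key=lambda s: toy_fitness(s, target))
        candidate_fit = toy_fitness(candidate, target)
        if candidate_fit >= best_fit:
            best, best_fit = candidate, candidate_fit
    return best, best_fit


if __name__ == "__main__":
    target_profile = encode("LLLLLLLL")  # arbitrary illustrative target
    designed, score = evolve("DDDDDDDD", target_profile)
    print(designed, score)
```

In a setting closer to the one evaluated here, `toy_fitness` would be replaced by the output of a network trained on representative sequence data, and further descriptors such as side-chain volume could be appended to the encoding.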
