Improvements in protein secondary structure prediction by an enhanced neural network.

Computational neural networks have recently been used to predict the mapping between protein sequence and secondary structure. They have proven adequate for determining the first-order dependence between these two sets but have, until now, been unable to capture the higher-order information that helps determine secondary structure. By adding neural network units that detect periodicities in the input sequence, we have modestly increased secondary structure prediction accuracy. Using tertiary structural class causes a marked increase in accuracy. The best-case prediction accuracy was 79%, for the class of all-alpha proteins. A scheme for employing neural networks to validate and refine structural hypotheses is proposed. The operational difficulties of applying a learning algorithm to a dataset where sequence heterogeneity is under-represented and where local and global effects are inadequately partitioned are discussed.
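
The abstract does not specify how the periodicity-detecting units are built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a crude binary hydrophobicity encoding (the `HYDROPHOBIC` set is an illustrative choice) and measures how strongly a sequence window oscillates at a given angular step per residue, in the style of a hydrophobic moment. A step of about 100 degrees corresponds to the 3.6-residue repeat of an alpha helix, and a step of 180 degrees to the alternating pattern of a beta strand. All function names are hypothetical.

```python
import cmath
import math

# Illustrative assumption: a crude binary hydrophobicity encoding.
# This set is a hypothetical choice, not taken from the paper.
HYDROPHOBIC = set("AILMFVWC")


def encode(seq):
    """Map each residue to +1.0 (hydrophobic) or -1.0 (otherwise)."""
    return [1.0 if aa in HYDROPHOBIC else -1.0 for aa in seq.upper()]


def periodicity_strength(seq, angle_deg):
    """Normalized magnitude of the periodic component of `seq`.

    Residue k contributes its encoded value rotated by k * angle_deg.
    The result lies in [0, 1]; values near 1 mean the hydrophobic
    pattern repeats with exactly that angular step per residue.
    """
    values = encode(seq)
    if not values:
        return 0.0
    step = math.radians(angle_deg)
    total = sum(v * cmath.exp(1j * step * k) for k, v in enumerate(values))
    return abs(total) / len(values)


def amphipathic_toy(length, angle_deg=100.0):
    """Build a toy sequence whose hydrophobic pattern follows one angular step.

    Leucine (L) is placed where the cosine is positive, lysine (K) elsewhere.
    """
    step = math.radians(angle_deg)
    return "".join("L" if math.cos(step * k) > 0 else "K" for k in range(length))


if __name__ == "__main__":
    window = amphipathic_toy(18)
    # Compare the helix-like and strand-like periodicity of the same window.
    print(window)
    print("helix-like (100 deg):", periodicity_strength(window, 100.0))
    print("strand-like (180 deg):", periodicity_strength(window, 180.0))
```

In a network of the kind the abstract describes, such values could be supplied as extra inputs alongside the residue window. Whether the original units were computed this way is an assumption.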
