Patterns, structures, and amino acid frequencies in structural building blocks, a protein secondary structure classification scheme

To study local structures in proteins, we previously developed an autoassociative artificial neural network (autoANN) and clustering tool to discover intrinsic features of macromolecular structures. The hidden unit activations computed by the trained autoANN are a convenient low‐dimensional encoding of the local protein backbone structure. Clustering these activation vectors results in a unique classification of local protein structural features called Structural Building Blocks (SBBs). Here we describe the application of this method to a larger database of proteins, verify its applicability to structure classification, and analyze amino acid frequencies and several commonly occurring patterns of SBBs. The SBB classification method has several interesting properties: 1) it identifies the regular secondary structures, α helix and β strand; 2) it consistently identifies other local structural features (e.g., helix caps and strand caps); 3) strong amino acid preferences are revealed at some positions in some SBBs; and 4) distinct patterns of SBBs occur in the “random coil” regions of proteins. Analysis of these patterns identifies interesting structural motifs in the protein backbone structure, indicating that SBBs can be used as “building blocks” in the analysis of protein structure. This type of pattern analysis should increase our understanding of the relationship between protein sequence and local structure, particularly for the prediction of protein structures. © 1997 Wiley‐Liss, Inc.
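
The two-stage idea described above (compress each local backbone window through the hidden layer of an autoassociative network, then cluster the hidden activations) can be illustrated with a minimal sketch. This is not the authors' autoANN: the input features (toy phi/psi angles scaled by 1/180), the window length, the network size, the training settings, and the use of k-means as the clustering step are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import math
import random

def make_toy_windows(n_per_class=20, seed=1):
    """Synthetic 3-residue windows of (phi, psi) / 180; illustrative values only."""
    rng = random.Random(seed)
    helix, strand = (-60 / 180, -45 / 180), (-120 / 180, 130 / 180)
    windows = []
    for centre in (helix, strand):
        for _ in range(n_per_class):
            # centre * 3 repeats the (phi, psi) pair for three residues
            windows.append([a + rng.gauss(0, 0.05) for a in centre * 3])
    return windows

def encode(x, w1, b1):
    """Hidden-unit activations: the low-dimensional code for one window."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w1, b1)]

def train_autoassociator(data, n_hidden, epochs=200, lr=0.05, seed=0):
    """Train input -> tanh hidden -> linear output to reproduce its own input."""
    rng = random.Random(seed)
    n_in = len(data[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b2 = [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            h = encode(x, w1, b1)
            y = [sum(w * hj for w, hj in zip(row, h)) + b for row, b in zip(w2, b2)]
            err = [yi - xi for yi, xi in zip(y, x)]  # gradient of 0.5 * squared error
            # Backpropagate through the output weights before updating them.
            dh = [(1 - h[j] ** 2) * sum(err[i] * w2[i][j] for i in range(n_in))
                  for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    w2[i][j] -= lr * err[i] * h[j]
                b2[i] -= lr * err[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    w1[j][k] -= lr * dh[j] * x[k]
                b1[j] -= lr * dh[j]
    return w1, b1, w2, b2

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain Lloyd iterations with farthest-point initialisation (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

if __name__ == "__main__":
    data = make_toy_windows()
    w1, b1, _, _ = train_autoassociator(data, n_hidden=2)
    codes = [encode(x, w1, b1) for x in data]   # one 2-D code per window
    labels = kmeans(codes, k=2)                 # cluster ids play the role of SBB labels
    print(labels)
```

In this toy setting each cluster label stands in for one building-block class; a real application would use windows taken from experimentally determined structures and would need to choose the number of hidden units and clusters empirically.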
