Structural relationships of homologous proteins as a fundamental principle in homology modeling

Protein structure prediction relies mainly on modeling proteins by homology to known structures; this knowledge-based approach is the most promising method to date. Although it is used throughout protein research, no general rules concerning the quality and applicability of the concepts and procedures used in homology modeling have yet been put forward. The main goal of the present work is therefore to provide tools for assessing the accuracy of modeling at a given level of sequence homology. A large set of known structures from different conformational and functional classes and with various degrees of homology was selected, and pairwise structure superpositions were performed. Starting with the definition of the structurally conserved regions and the determination of topologically correct sequence alignments, we correlated geometrical properties with sequence homology (defined by the 250 PAM Dayhoff matrix) and identity. It is shown that both the topological differences of the protein backbones and the relative positions of corresponding side chains diverge with decreasing sequence identity. Below 50% identity, the deviation in regions that are not structurally conserved increases continually, implying that with decreasing sequence identity modeling has to take into account more and more structurally diverging loop regions that are difficult to predict. © 1993 Wiley‐Liss, Inc.
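The two quantities being correlated can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the two structures have already been superposed and that equivalent Cα positions have been paired, and it computes plain percent identity rather than the Dayhoff-matrix homology score. The function names `percent_identity` and `rmsd`, the gap character, and the toy inputs are all hypothetical choices made for illustration.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over alignment columns without gaps.

    Assumption: both strings come from the same pairwise alignment and
    therefore have the same length; columns with a gap in either
    sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates.

    Assumption: the coordinates are already superposed, and the i-th
    point of coords_a is structurally equivalent to the i-th point of
    coords_b (for example, C-alpha atoms of a conserved region).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy alignment: four ungapped columns, three of them identical.
    print(percent_identity("ACD-F", "ACE-F"))
    # Toy coordinates: every pair is displaced by one unit along z.
    print(rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
               [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]))
```

Computing both values for many structure pairs and plotting RMSD against identity would give the kind of relationship the abstract describes, with the deviation expected to grow as identity falls.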
