Structural superposition of proteins with unknown alignment and detection of topological similarity using a six‐dimensional search algorithm

An algorithm for the rigid‐body superposition of proteins is described and tested. No prior knowledge of equivalent residues is required. To find the common structural core of two proteins, an exhaustive grid search is conducted in three‐dimensional angle space, and at each grid point a fast translation search in three‐dimensional space is performed. The best superposition at a given angle set is defined by the translation vector that maximizes the weighted number of equivalent Cα atoms. Filters that use the sequential character of the polypeptide chain are employed to identify the rotation and translation that yield the highest topological similarity of the two proteins. The algorithm is shown to find the best superposition of distantly related structures, and to be capable of finding structures similar to a given atomic model in the Brookhaven Protein Data Bank. In a search using granulocyte‐macrophage colony‐stimulating factor as a template, all other four‐helix bundle cytokines with up‐up‐down‐down topology gave the highest values of a topological similarity score, followed by interferon‐β and ‐γ and those four‐helix bundles with the more common up‐down‐up‐down topology. In another example, the insertion domain of the long variant adenylate kinases is demonstrated to share its fold with rubredoxin. © 1995 Wiley‐Liss, Inc.
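
The two-level search described in the abstract (an exhaustive grid over three rotation angles, with a translation search at each grid point) can be illustrated with a small sketch. This is not the paper's implementation. It assumes Z-Y-Z Euler angles, a coarse 30° grid, an unweighted count of Cα atoms within a 1 Å cutoff as the score, and a simple voting scheme over pairwise difference vectors as a stand-in for the fast translation search. The sequence-order filters for topological similarity are omitted. All function names and parameters (`six_d_search`, `best_translation`, `bin_size`, `cutoff`) are hypothetical.

```python
import math


def rotation_matrix(alpha, beta, gamma):
    """Z-Y-Z Euler rotation Rz(alpha) * Ry(beta) * Rz(gamma), angles in radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
        [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
        [-sb * cg, sb * sg, cb],
    ]


def apply_rotation(rot, p):
    """Rotate a single 3D point."""
    return tuple(sum(rot[r][c] * p[c] for c in range(3)) for r in range(3))


def best_translation(fixed, rotated, bin_size=1.0):
    """Assumed translation search: every atom pair votes for the translation
    that would superimpose it; the mean vector of the fullest bin wins."""
    bins = {}
    for f in fixed:
        for m in rotated:
            d = (f[0] - m[0], f[1] - m[1], f[2] - m[2])
            key = tuple(round(x / bin_size) for x in d)
            entry = bins.setdefault(key, [0, 0.0, 0.0, 0.0])
            entry[0] += 1
            for k in range(3):
                entry[k + 1] += d[k]
    n, sx, sy, sz = max(bins.values(), key=lambda e: e[0])
    return (sx / n, sy / n, sz / n)


def count_equivalent(fixed, moved, cutoff=1.0):
    """Unweighted score: fixed atoms that have a moved atom within the cutoff."""
    return sum(1 for f in fixed if any(math.dist(f, m) <= cutoff for m in moved))


def six_d_search(fixed, moving, step_deg=30):
    """Grid search over rotations; at each grid point, find the best translation.
    Returns (score, (alpha, beta, gamma) in degrees, translation vector)."""
    best = (-1, None, None)
    for a in range(0, 360, step_deg):
        for b in range(0, 181, step_deg):
            for g in range(0, 360, step_deg):
                rot = rotation_matrix(*(math.radians(x) for x in (a, b, g)))
                rotated = [apply_rotation(rot, p) for p in moving]
                t = best_translation(fixed, rotated)
                moved = [(x + t[0], y + t[1], z + t[2]) for x, y, z in rotated]
                score = count_equivalent(fixed, moved)
                if score > best[0]:
                    best = (score, (a, b, g), t)
    return best
```

Calling `six_d_search(fixed, moving)` with two lists of Cα coordinate tuples returns the best score together with the angle set and translation that produced it. A production search would need a finer angular grid, a weighting of the equivalent atoms, and the chain-order filters that the abstract mentions.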
