The ups and downs of protein topology; rapid comparison of protein structure.

Protein topology can be described at different levels. At the most fundamental level, it is a sequence of secondary structure elements (a "primary topology string"). Searching predicted primary topology strings against a library of strings derived from known protein structures is the basis of some protein fold recognition methods. Here, TOPSCAN, a method for rapid comparison of protein structures, is presented. Rather than a simple two-letter alphabet (encoding strand and helix), TOPSCAN uses more complex alphabets that encode, in addition to secondary structure, the direction, proximity, accessibility and length of secondary structure elements and loops. The structural information content of primary topology strings is compared with that of encodings which contain this additional information ("secondary topology strings"). The algorithm is extremely fast: a scan of a large domain against a library of more than 2000 secondary structure strings completes in approximately 30 s. An analysis of protein fold similarity using TOPSCAN at the primary and secondary topology levels is presented.
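The idea of scanning a topology string against a library can be illustrated with a minimal sketch. It is not the TOPSCAN implementation: the abstract does not specify the alignment algorithm, the alphabets or the scoring scheme, so the global dynamic programming alignment used here, the score values, the length cutoff and all function names (`encode_primary`, `encode_with_length`, `global_alignment_score`, `scan`) are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# A secondary structure element as (kind, length): "H" = helix, "E" = strand.
Element = Tuple[str, int]


def encode_primary(elements: List[Element]) -> str:
    """Two-letter alphabet: one character per element, H or E."""
    return "".join(kind for kind, _ in elements)


def encode_with_length(elements: List[Element], long_cutoff: int = 8) -> str:
    """Hypothetical richer alphabet: upper case marks elements with at least
    long_cutoff residues, lower case marks shorter ones. The cutoff is an
    assumption, not a value taken from the paper."""
    return "".join(
        kind.upper() if length >= long_cutoff else kind.lower()
        for kind, length in elements
    )


def global_alignment_score(a: str, b: str, match: int = 2,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Score of the best global alignment of two strings, computed by
    dynamic programming with a linear gap penalty (assumed scoring)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def scan(query: str, library: Dict[str, str]) -> List[Tuple[str, int]]:
    """Score the query against every library string, best hits first."""
    hits = [(name, global_alignment_score(query, s)) for name, s in library.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Made-up domain and library entries, for demonstration only.
    domain = [("E", 5), ("H", 12), ("E", 4), ("H", 10)]
    library = {"fold_a": "EHEH", "fold_b": "HHHH"}
    print(encode_with_length(domain))
    print(scan(encode_primary(domain), library))
```

Because each comparison works on strings of a few dozen characters rather than on atomic coordinates, a scan of this kind costs only one small alignment per library entry, which is consistent with the speed emphasised in the abstract.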
