Flexible Secondary Structure Based Protein Structure Comparison Applied to the Detection of Circular Permutation

We present a novel method for the structural comparison of proteins. The approach consists of two main phases: 1) an initial search phase in which, starting from aligned pairs of secondary structure elements, the space of 3D transformations is searched for similarities, and 2) a subsequent refinement phase in which interim solutions are subjected to parallel, local, iterative dynamic programming in the areas of possible improvement. The proposed method uses dynamic programming to find alignments but does not restrict solutions to be sequential. In addition, to deal with the nonuniqueness of optimal similarities, we introduce a consensus scoring method for selecting the preferred similarity and provide a list of top-ranked solutions. The method, called FASE (flexible alignment of secondary structure elements), was tested on well-known data and various standard problems from the literature. The results show that FASE consistently finds remote and weak similarities within a reasonable run time. In tests using the SCOP database, its ability to discriminate interfold pairs from intrafold pairs was at the level of the best existing methods. The method was then applied to the problem of finding circular permutations in proteins.
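Because the alignments are not required to be sequential, a circular permutation can in principle be read off the order of the aligned elements. The following is a minimal sketch of that idea, not the FASE procedure itself: it assumes an alignment is given as a list of index pairs (element i of the first structure matched to element j of the second), and the function names `order_breaks` and `is_circular_permutation` are hypothetical.

```python
from typing import List, Tuple

Alignment = List[Tuple[int, int]]  # (index in structure A, index in structure B)


def order_breaks(pairs: Alignment) -> int:
    """Count places where the B index decreases once pairs are sorted by A index."""
    b_indices = [j for _, j in sorted(pairs)]
    return sum(1 for prev, nxt in zip(b_indices, b_indices[1:]) if nxt < prev)


def is_circular_permutation(pairs: Alignment) -> bool:
    """Return True if the B indices form a rotation of an increasing sequence.

    A purely sequential alignment has zero order breaks. A circular
    permutation has exactly one break, and the last B index wraps around
    to lie before the first one. Anything else is treated as a more
    general non-sequential rearrangement.
    """
    b_indices = [j for _, j in sorted(pairs)]
    if len(b_indices) < 2:
        return False
    return order_breaks(pairs) == 1 and b_indices[-1] < b_indices[0]


if __name__ == "__main__":
    sequential = [(0, 0), (1, 1), (2, 2)]
    permuted = [(0, 3), (1, 4), (2, 0), (3, 1), (4, 2)]
    print(is_circular_permutation(sequential))
    print(is_circular_permutation(permuted))
```

This check only looks at element order; a real detector would also need to score the structural quality of the alignment before accepting the permutation.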
