FlexSnap: Flexible Non-sequential Protein Structure Alignment

Background: Proteins have evolved subject to energetic selection pressure for stability and flexibility. Structural similarity between proteins that have gone through conformational changes can be captured effectively if flexibility is considered. Topologically unrelated proteins that preserve secondary structure packing interactions can be detected if both flexibility and sequential permutations are considered. We propose the FlexSnap algorithm for flexible non-topological protein structural alignment.

Results: The effectiveness of FlexSnap is demonstrated by measuring the agreement of its alignments with manually curated non-sequential structural alignments. FlexSnap showed competitive results against state-of-the-art algorithms such as DALI, SARF2, MultiProt, FlexProt, and FATCAT. Moreover, on the DynDom dataset, FlexSnap reported longer alignments with smaller RMSD.

Conclusions: We have introduced FlexSnap, a greedy chaining algorithm that reports both sequential and non-sequential alignments and allows twists (hinges). We assessed the quality of the FlexSnap alignments by measuring their agreement with manually curated non-sequential alignments. On the FlexProt dataset, FlexSnap was competitive with state-of-the-art flexible alignment methods. Moreover, we demonstrated the benefits of introducing hinges by showing significant improvements in the alignments reported by FlexSnap for the structure pairs for which rigid alignment methods reported alignments with either low coverage or large RMSD.

Availability: An implementation of the FlexSnap algorithm will be made available online at http://www.cs.rpi.edu/~zaki/software/flexsnap.
