The meaning of alignment: lessons from structural diversity

Background
Protein structural alignment provides a fundamental basis for deriving principles of functional and evolutionary relationships. It is routinely used for structural classification and functional characterization of proteins and for the construction of sequence alignment benchmarks. However, the available techniques do not fully consider the implications of protein structural diversity and typically generate a single alignment between sequences.

Results
We have taken alternative protein crystal structures and generated simulation snapshots to explicitly investigate the impact of structural changes on the alignments. We show that structural diversity has a significant effect on structural alignment. Moreover, we observe alignment inconsistencies even for modest spatial divergence, implying that the biological interpretation of alignments is less straightforward than commonly assumed. A salient example is the GroES 'mobile loop' where sub-Ångstrom variations give rise to contradictory sequence alignments.

Conclusion
A comprehensive treatment of ambiguous alignment regions is crucial for further development of structural alignment applications and for the representation of alignments in general. For this purpose we have developed an on-line database containing our data and new ways of visualizing alignment inconsistencies, which can be found at http://www.ibi.vu.nl/databases/stralivari.
