An investigation of conservation-biased gap-penalties for multiple protein sequence alignment.

Sequence conservation in a multiple sequence alignment (or profile) is often used to guide the alignment of further sequences onto the profile. Most methods, however, consider only the opening of a gap at a single point, ignoring both what is contained in the inserted segment of one sequence (or profile) and what terminates the 'broken' ends of the other. An alignment algorithm is described that incorporates both aspects, and the relative importance of the contributions from the insert and from the 'broken' ends is assessed. The approach was tested on families of very remotely related sequences using a novel protocol developed to quantify both the stability and the generality of the solution.
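
The idea of a gap penalty that depends on conservation in the inserted segment and at the 'broken' ends can be illustrated with a minimal sketch. This is not the scoring function of the algorithm described above, whose details the abstract does not give. The entropy-based conservation measure, the linear form of the penalty, and the names `column_conservation`, `gap_cost`, `base`, `w_insert` and `w_ends` are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_conservation(column):
    """Conservation of one alignment column in [0, 1] (1 = invariant).

    Assumption: conservation is 1 minus the normalised Shannon entropy
    of the residues in the column, ignoring gap characters ('-').
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    # Largest entropy attainable with this many residues and 20 amino acids.
    max_entropy = math.log(min(total, 20))
    if max_entropy == 0.0:
        return 1.0  # a single residue is trivially invariant
    return 1.0 - entropy / max_entropy


def gap_cost(insert_cons, left_end_cons, right_end_cons,
             base=10.0, w_insert=1.0, w_ends=1.0):
    """Hypothetical conservation-biased cost of one gap.

    insert_cons    -- conservation of each position in the inserted segment
    left_end_cons  -- conservation at the left 'broken' end of the other profile
    right_end_cons -- conservation at the right 'broken' end
    w_insert and w_ends set the relative weight of the two contributions.
    """
    insert_term = sum(insert_cons) / len(insert_cons) if insert_cons else 0.0
    ends_term = (left_end_cons + right_end_cons) / 2.0
    return base * (1.0 + w_insert * insert_term + w_ends * ends_term)


# Toy profile: columns 1, 2 and 4 are invariant, column 3 is fully variable.
profile = ["ACDA", "ACEA", "ACFA", "ACGA"]
cons = [column_conservation(col) for col in zip(*profile)]

# Inserting the variable column between two conserved ends costs less
# than inserting a conserved column between the same ends.
cheap = gap_cost([cons[2]], cons[1], cons[3])
costly = gap_cost([cons[0]], cons[1], cons[3])
```

Setting `w_insert` or `w_ends` to zero switches off one contribution, which is one simple way to compare the relative importance of the insert and the 'broken' ends in such a scheme.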
