A novel approach to local reliability of sequence alignments

MOTIVATION: An algorithmically computed pairwise alignment of biological sequences will in general contain both correct and incorrect parts. To interpret the alignment validly, its local trustworthiness therefore has to be quantified.

RESULTS: We present a novel approach that assigns a reliability index to every pair of residues, including gapped regions, in the optimal alignment of two protein sequences. The method is based on a fuzzy recast of the dynamic programming algorithm for sequence alignment in terms of mean field annealing. An extensive evaluation against structural reference alignments shows that the probability that a pair of residues is correctly aligned grows consistently with increasing reliability index, and moreover that the value of the reliability index can be translated directly into an estimate of the probability of a correct alignment.
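To make the idea of a per-pair reliability value concrete, the following is a minimal sketch of a related but simpler technique: finite-temperature (Boltzmann-weighted) global alignment computed with forward and backward dynamic programming. It is not the mean field annealing formulation described above. The function name `match_posteriors`, the match/mismatch/gap scores, and the linear gap model are all illustrative assumptions.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-weights."""
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))


def match_posteriors(a, b, match=2.0, mismatch=-1.0, gap=-2.0, temperature=1.0):
    """Posterior probability that a[i] is aligned to b[j], taken over all
    global alignments weighted by exp(score / temperature).

    The scoring scheme (hypothetical values, linear gap cost) is an
    assumption made for illustration only.
    """
    n, m = len(a), len(b)
    beta = 1.0 / temperature
    g = beta * gap

    def s(i, j):
        return beta * (match if a[i] == b[j] else mismatch)

    # forward[i][j]: log total weight of all alignments of a[:i] with b[:j]
    forward = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    forward[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            terms = []
            if i > 0 and j > 0:
                terms.append(forward[i - 1][j - 1] + s(i - 1, j - 1))
            if i > 0:
                terms.append(forward[i - 1][j] + g)
            if j > 0:
                terms.append(forward[i][j - 1] + g)
            forward[i][j] = logsumexp(terms)

    # backward[i][j]: log total weight of all alignments of a[i:] with b[j:]
    backward = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    backward[n][m] = 0.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            terms = []
            if i < n and j < m:
                terms.append(backward[i + 1][j + 1] + s(i, j))
            if i < n:
                terms.append(backward[i + 1][j] + g)
            if j < m:
                terms.append(backward[i][j + 1] + g)
            backward[i][j] = logsumexp(terms)

    log_z = forward[n][m]  # log partition function over all alignments
    return [
        [math.exp(forward[i][j] + s(i, j) + backward[i + 1][j + 1] - log_z)
         for j in range(m)]
        for i in range(n)
    ]


if __name__ == "__main__":
    for row in match_posteriors("HEAGAWGHEE", "PAWHEAE"):
        print(" ".join(f"{p:.2f}" for p in row))
```

In this sketch, a low temperature concentrates the weight on the optimal alignment, while a higher temperature spreads it over suboptimal alternatives, so pairs that keep a high posterior are those supported by many near-optimal alignments. Whether such posteriors track the probability of a correct alignment would have to be checked against reference alignments, as the evaluation described above does for the reliability index.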
