Paper information - Parametric Recomputing in Alignment Graphs

Parametric Recomputing in Alignment Graphs

DNA/protein sequence alignments in computational molecular biology depend heavily on the settings of penalties for substitutions, insertions/deletions and gaps. Inappropriate choice of parameters causes irrelevant matches (“noise”) to be reported, thus obscuring biologically relevant matches. In practice, biologists frequently compare sequences in a few iterations, starting from a vague idea about appropriate parameters, then refining parameters to reduce noise. This procedure often helps to delineate biologically interesting similarities and to substantially reduce laborious analysis. This paper provides a computational underpinning for such iterative noise filtration in alignment graphs. Our main results assume that a preliminary “noisy” alignment, computed with reasonable but ad hoc parameters, is given; the problem is to modify the parameters to reduce noise. We present fast algorithms to refine penalty parameters and describe an application of these algorithms.

Pavel A. Pevzner | Webb Miller | Xiaoqiu Huang
