Paper Information - A Linear-Time Algorithm for the 1-Mismatch Problem

A Linear-Time Algorithm for the 1-Mismatch Problem

For sequence alignments (which can be viewed simply as rectangular arrays of characters), a frequent need is to identify regions, each consisting of a run of consecutive columns, that have some particular property. The 1-mismatch problem is to locate all maximal regions in a given alignment for which there exists a (not necessarily unique) “center” sequence such that, inside the region, every alignment row is within Hamming distance 1 of the center. We first describe some properties of these regions and their centers, and then use these properties to construct an algorithm that, for a d × n alignment, runs in Θ(nd) time and uses Θ(d) extra space (beyond that needed to store the alignment itself).
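To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm the abstract refers to, and the function names `has_center` and `maximal_regions` are hypothetical. It assumes the alignment is a non-empty list of equal-length strings and reports regions as half-open, 0-indexed column ranges. It relies on two observations: any valid center differs from the first row in at most one position, and a sub-region of a valid region is itself valid, so a sliding window finds every maximal region.

```python
def has_center(rows, start, end):
    """Return True if some center sequence lies within Hamming distance 1
    of every row restricted to columns [start, end)."""
    segs = [r[start:end] for r in rows]
    base = segs[0]
    # A valid center is within distance 1 of the first row, so it is either
    # the first row itself or the first row with one position changed.
    # Only characters that occur in that column can help.
    candidates = {base}
    for p in range(len(base)):
        for c in {s[p] for s in segs}:
            candidates.add(base[:p] + c + base[p + 1:])

    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    return any(all(hamming(c, s) <= 1 for s in segs) for c in candidates)


def maximal_regions(rows):
    """Return all maximal regions as (start, end) half-open column ranges."""
    n = len(rows[0])
    regions = []
    prev_end = 0
    end = 0
    for start in range(n):
        # A single column always has a center, and the window that was
        # valid for the previous start stays valid after dropping a column.
        end = max(end, start + 1)
        while end < n and has_center(rows, start, end + 1):
            end += 1
        # The region is maximal only if it reaches past the previous one;
        # otherwise it is contained in a region that starts earlier.
        if end > prev_end:
            regions.append((start, end))
        prev_end = end
    return regions
```

For example, on the two-row alignment `["AAAA", "TTTA"]` the rows differ in three of the first three columns, so no center exists for columns 0 to 2 together, and the sketch reports the two overlapping maximal regions `(0, 2)` and `(1, 4)`. This brute-force check takes far more than Θ(nd) time; it is meant only to illustrate the definition.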
