Exact Algorithms for Computing Pairwise Alignments and 3-Medians From Structure-Annotated Sequences (Extended Abstract)

Given the problem of mutation saturation in ancient molecular sequences, there is great interest in inferring phylogenies from higher-order types of molecular data that change more slowly, such as genomic organization and the secondary and tertiary structures of ribosomal RNA and proteins. In this paper, we define edit distances based on two representations of RNA secondary structure, arc annotation and hierarchical string annotation. We give algorithms for computing these distances on pairs of annotated sequences, for aligning pairs of annotated sequences, and for computing 3-median annotated sequences from triples of annotated sequences. The 3-median algorithms can be used as part of a well-known iterative heuristic for inferring phylogenies. All of these algorithms are adapted from algorithms for computing longest common annotated subsequences of pairs of annotated sequences.
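
As a rough illustration of the objects involved, the following Python sketch shows one possible encoding of an arc-annotated sequence and the classical, unannotated longest-common-subsequence recurrence that annotated variants build on. It is not the algorithm of this paper. The dot-bracket input format, the function names (arcs_from_dot_bracket, lcs_length, indel_distance), and the insertion/deletion-only distance are all assumptions chosen for the example, and the distance shown ignores the arcs entirely.

```python
from typing import List, Set, Tuple

# An arc joins two sequence positions (i, j) with i < j, e.g. a base pair.
Arc = Tuple[int, int]


def arcs_from_dot_bracket(structure: str) -> Set[Arc]:
    """Turn a dot-bracket string into a set of nested arcs.

    Assumption for this sketch: '(' opens an arc, ')' closes the most
    recently opened one, and '.' marks an unpaired position.
    """
    stack: List[int] = []
    arcs: Set[Arc] = set()
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            arcs.add((stack.pop(), j))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return arcs


def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of two plain strings.

    Classical dynamic program, kept to two rows of the table.
    Annotations (arcs) are NOT taken into account here.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def indel_distance(a: str, b: str) -> int:
    """Edit distance when only insertions and deletions are allowed.

    Every symbol outside a longest common subsequence is deleted from
    one string or inserted into the other.
    """
    return len(a) + len(b) - 2 * lcs_length(a, b)


if __name__ == "__main__":
    print(sorted(arcs_from_dot_bracket("((..))")))
    print(lcs_length("GACUA", "GCUAA"), indel_distance("GACUA", "GCUAA"))
```

The identity used in indel_distance holds only for the insertion/deletion-only setting on plain strings; distances that also account for arcs or hierarchical annotations need operations and costs defined on the annotations themselves.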