An Algorithm for Statistical Alignment of Sequences Related by a Binary Tree

An algorithm is presented that allows the calculation of the probability of a set of sequences related by a binary tree that has evolved according to the Thorne-Kishino-Felsenstein model (1991) for a fixed set of parameters. There are two ideas underlying this algorithm. Firstly, a markov chain is defined that generates ancestral sequences and their alignment at two neighboring nodes in a tree. Secondly, a stochastic walk on the binary tree, that defines a markov chain generating ancestral sequences and their alignment at the internal nodes in the tree is described. The running time of this algorithm is O(l2 kappa), where l is the geometric average of the sequence lengths and kappa the number of sequences--leaves at the binary tree. This could be improved to O(l kappa).