Parsimonious Reconstruction of Sequence Evolution and Haplotype Blocks

Under the infinite-sites model of mutation, we consider the problem of finding the minimum number of recombination events that must have occurred in the evolutionary history of sampled DNA sequences. Our approach is deterministic and is based on the combinatorics of leaf-labelled rooted trees. In contrast to previously known approaches, which yield only estimated lower bounds, our approach always gives the exact minimum number of recombination events. Furthermore, our method can be used to explicitly reconstruct evolutionary histories with the minimum number of recombination events. As an additional application, we discuss how our work can be used to define haplotype blocks.
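
For orientation, the kind of lower bound that the exact minimum is contrasted with can be sketched as follows. This is not the tree-based method of this work. It is a minimal illustration of a classical bound of the Hudson-Kaplan type, assuming binary sequences encoded as strings of "0" and "1" and an unknown ancestral state. Under the infinite-sites model, two sites that exhibit all four gametes (00, 01, 10, 11) cannot be explained without recombination between them, and the bound counts the fewest breakpoints needed to separate every such pair. The function names `incompatible` and `hudson_kaplan_bound` are hypothetical and chosen for this sketch.

```python
from itertools import combinations


def incompatible(seqs, i, j):
    """Four-gamete test: True if sites i and j show all four gametes.

    Assumes a binary alphabet, so four distinct pairs means 00, 01, 10, 11.
    """
    gametes = {(s[i], s[j]) for s in seqs}
    return len(gametes) == 4


def hudson_kaplan_bound(seqs):
    """Lower bound on the number of recombination events (illustrative sketch).

    Each incompatible site pair (i, j) requires at least one breakpoint
    somewhere between site i and site j. The bound is the minimum number of
    breakpoints hitting all such intervals, found greedily by right endpoint.
    """
    n_sites = len(seqs[0])
    intervals = [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if incompatible(seqs, i, j)
    ]
    intervals.sort(key=lambda ij: ij[1])

    count = 0
    last_right = 0  # the most recent breakpoint sits just before this site
    for i, j in intervals:
        if i >= last_right:  # interval not yet hit by a chosen breakpoint
            count += 1
            last_right = j
    return count


if __name__ == "__main__":
    sample = ["000", "011", "101", "110"]
    print(hudson_kaplan_bound(sample))
```

A bound of this kind never exceeds the true minimum but may fall short of it, which is the gap an exact method is meant to close.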