Common Intervals of Two Sequences

Searching for subsets of genes that appear consecutively in two or more genomes is a useful approach to identifying clusters of functionally associated genes. One possible formalization of this problem models the order in which the genes appear in each considered genome as a permutation of their order in the first genome, and then finds k-tuples of contiguous subsets of these permutations that consist of the same elements: the common intervals. A drawback of this approach is that it cannot account for paralogous genes or internal genomic duplications, since each element occurs only once in a permutation. Handling these cases requires modeling the order of genes by sequences that are not necessarily permutations.
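
To make the definition concrete, the following is a minimal Python sketch of common intervals for two sequences in which elements may repeat. It is a naive enumeration written for illustration only, not the algorithm developed in this work; the function names `interval_sets` and `common_intervals` are hypothetical, and a common interval is taken here to be any set of elements that is exactly the content of some contiguous stretch in both sequences.

```python
def interval_sets(seq):
    """Return the element sets of all contiguous substrings of seq.

    Repeated elements (e.g. paralogous genes) are allowed: each substring
    contributes the set of distinct elements it contains.
    """
    found = set()
    for start in range(len(seq)):
        current = set()
        for end in range(start, len(seq)):
            current.add(seq[end])
            # frozenset makes the element set hashable so it can be stored.
            found.add(frozenset(current))
    return found


def common_intervals(seq_a, seq_b):
    """Return the element sets that occur contiguously in both sequences."""
    return interval_sets(seq_a) & interval_sets(seq_b)
```

For example, with the sequences `"abca"` and `"cab"`, the set {a, c} is a common interval (it appears as `ca` in both), whereas {b, c} is not, because `b` and `c` are never adjacent in `"cab"`. This brute-force version stores every substring's element set, so it is only suitable for short sequences.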