Finding Nested Common Intervals Efficiently

In this paper, we study the problem of efficiently finding gene clusters formalized by nested common intervals between two genomes represented either as permutations or as sequences. For permutations, we give several algorithms whose running time depends on the size of the actual output rather than on the worst-case output size. We first provide a straightforward O(n^3) time algorithm for finding all nested common intervals. We then reduce this complexity with an O(n^2) time algorithm that computes an irredundant output. Finally, we give a third algorithm showing that the maximal nested common intervals alone can be found in linear time. For sequences, we provide solutions (modifications of previously defined algorithms and a new algorithm) for different variants of the problem, depending on how duplicated genes are to be treated.
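To make the notion concrete for the permutation case, the following is a minimal brute-force sketch and not one of the algorithms of the paper. It assumes the usual definitions, which the abstract does not spell out: a common interval is a set of genes that is contiguous in both permutations, and a common interval I is nested if |I| = 1 or if it contains a nested common interval of size |I| - 1. The function name `nested_common_intervals` and the representation of intervals as position pairs in the second permutation are illustrative choices.

```python
def nested_common_intervals(p1, p2):
    """Return all nested common intervals of two permutations of the same genes.

    Each interval is reported as an inclusive pair (i, j) of positions in p2.
    Assumed definition: a common interval is nested if it has size 1 or
    contains a nested common interval that is one element smaller.
    Cubic time because of the min/max scan; a sketch, not an optimized method.
    """
    n = len(p1)
    rank = {gene: k for k, gene in enumerate(p1)}  # position of each gene in p1
    pi = [rank[gene] for gene in p2]               # p2 rewritten in p1's coordinates

    nested = [[False] * n for _ in range(n)]
    result = []
    for length in range(1, n + 1):                 # shorter intervals first
        for i in range(0, n - length + 1):
            j = i + length - 1
            window = pi[i:j + 1]
            # Contiguous in p2 by construction; contiguous in p1 iff the
            # ranks form a gap-free range.
            is_common = max(window) - min(window) == j - i
            # A sub-interval of size length - 1 that is contiguous in p2
            # must drop either the first or the last position.
            if is_common and (length == 1 or nested[i + 1][j] or nested[i][j - 1]):
                nested[i][j] = True
                result.append((i, j))
    return result


if __name__ == "__main__":
    found = nested_common_intervals([1, 2, 3, 4, 5], [3, 1, 2, 5, 4])
    print([iv for iv in found if iv[0] != iv[1]])
```

In this example the whole permutation is a common interval, yet it is not reported, because neither of its sub-intervals of size 4 is a common interval. That is the distinction between common and nested common intervals that the sketch is meant to show.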
