Computation of Median Gene Clusters

Whole-genome comparison based on gene order has become a popular approach in comparative genomics. An important task in this field is the detection of gene clusters, i.e., sets of genes that occur co-localized in several genomes. For most applications it is preferable to extend this definition to allow small deviations in the gene content of the cluster occurrences. However, relaxing the requirement that all occurrences have exactly the same gene content drastically increases the computational complexity of gene cluster detection. Existing approaches deal with this problem by imposing simplifying constraints on the cluster definition, by allowing only pairwise genome comparison, or both. In this paper, we introduce a cluster concept named median gene clusters that improves on existing models, and we present efficient algorithms for computing such clusters, which allow the detection of approximate gene clusters in multiple genomes.
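To make the notion of an approximate cluster occurrence concrete, the following is a minimal brute-force sketch, not the algorithm of this paper. It assumes a genome is given as a list of gene identifiers and that the deviation in gene content is measured by the size of the symmetric set difference; the function name `approximate_occurrences` and the parameter `max_dist` are hypothetical and chosen only for illustration.

```python
def approximate_occurrences(genome, cluster, max_dist):
    """Return inclusive (start, end) index pairs of intervals of `genome`
    whose gene set differs from `cluster` by at most `max_dist` genes.

    Assumption: deviation = size of the symmetric set difference
    (missing genes plus extra genes). This is an illustrative choice.
    """
    cluster = set(cluster)
    hits = []
    for start in range(len(genome)):
        content = set()  # gene content of genome[start..end]
        for end in range(start, len(genome)):
            content.add(genome[end])
            if len(content ^ cluster) <= max_dist:
                hits.append((start, end))
    return hits


if __name__ == "__main__":
    genome = [1, 2, 3, 4, 5]
    # Exact occurrences only (max_dist = 0)
    print(approximate_occurrences(genome, {2, 3, 4}, 0))
    # Occurrences with at most one missing or extra gene
    print(approximate_occurrences(genome, {2, 3, 4}, 1))
```

With `max_dist = 0` the sketch reduces to exact matching of gene content. Raising `max_dist` admits intervals that miss a cluster gene or contain an extra one, which enlarges the set of candidate occurrences and illustrates why relaxing the equality constraint makes detection harder.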
