Improved Algorithms for Finding Gene Teams and Constructing Gene Team Trees

A gene team is a set of genes that appear in two or more species, possibly in a different order, such that on each chromosome the distance between adjacent genes of the team never exceeds a given threshold δ. A gene team tree is a succinct representation of all gene teams for every possible value of δ. This paper presents improved algorithms for two problems: finding the gene teams of two chromosomes, and constructing a gene team tree of two chromosomes. For finding gene teams, Beal et al. gave an O(n lg² n)-time algorithm; our improved algorithm requires O(n lg t) time, where t ≤ n is the number of gene teams. For constructing a gene team tree, Zhang and Leong gave an O(n lg² n)-time algorithm; our improved algorithm requires O(n lg n lg lg n) time. Like Beal et al.'s gene team algorithm and Zhang and Leong's gene team tree algorithm, our improved algorithms can be extended to k chromosomes, with the time complexities increasing only by a factor of k.
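To make the definition concrete, the following is a minimal Python sketch of a naive way to find the gene teams of two chromosomes for a fixed δ. It is not the O(n lg t) algorithm of this paper, nor Beal et al.'s algorithm; it simply splits a candidate set wherever adjacent genes are more than δ apart on either chromosome and repeats until no split occurs. The names `split_by_gap` and `gene_teams`, the representation of a chromosome as a dictionary from gene name to position, and the assumption that each gene occurs once per chromosome are all illustrative choices, not taken from the paper.

```python
def split_by_gap(genes, pos, delta):
    """Split genes into runs whose adjacent positions differ by at most delta."""
    ordered = sorted(genes, key=lambda g: pos[g])
    runs, current = [], [ordered[0]]
    for prev, gene in zip(ordered, ordered[1:]):
        if pos[gene] - pos[prev] > delta:
            # Gap too large: close the current run and start a new one.
            runs.append(current)
            current = []
        current.append(gene)
    runs.append(current)
    return runs


def gene_teams(pos1, pos2, delta):
    """Naive gene-team finder for two chromosomes (illustrative sketch).

    pos1 and pos2 map each gene to its position on chromosome 1 and 2.
    Assumes every gene occurs at most once per chromosome.
    Single genes are reported as (trivial) teams.
    """
    pending = [set(pos1) & set(pos2)]  # only genes present on both chromosomes
    teams = []
    while pending:
        group = pending.pop()
        if not group:
            continue
        # Refine by chromosome 1, then refine each part by chromosome 2.
        refined = [
            set(part2)
            for part1 in split_by_gap(group, pos1, delta)
            for part2 in split_by_gap(part1, pos2, delta)
        ]
        if len(refined) == 1:
            teams.append(refined[0])  # no split on either chromosome
        else:
            pending.extend(refined)  # re-check every smaller piece
    return teams


# Hypothetical positions, chosen only for illustration.
chrom1 = {"a": 1, "b": 2, "c": 4, "d": 8, "e": 9}
chrom2 = {"a": 4, "b": 5, "c": 2, "d": 9, "e": 1}
print(sorted(sorted(team) for team in gene_teams(chrom1, chrom2, 2)))
# prints [['a', 'b', 'c'], ['d'], ['e']]
```

In this example, genes d and e are adjacent on the first chromosome but 8 positions apart on the second, so with δ = 2 they cannot belong to the same team. Repeated splitting of this kind can take quadratic time in the worst case, which is the cost the faster algorithms avoid.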

[1]  Takeaki Uno,et al.  Fast Algorithms to Enumerate All Common Intervals of Two Permutations , 1997, Algorithmica.

[2]  Hon Wai Leong,et al.  Gene Team Tree: A Hierarchical Representation of Gene Teams for All Gap Lengths , 2009, J. Comput. Biol.

[3]  Mathieu Raffinot,et al.  Fast algorithms for identifying maximal common connected sets of interval graphs , 2006, Discret. Appl. Math..

[4]  Mathieu Raffinot,et al.  An algorithmic view of gene teams , 2004, Theor. Comput. Sci..

[5]  S. Sitharama Iyengar,et al.  Introduction to parallel algorithms , 1998, Wiley series on parallel and distributed computing.

[6]  Peter van Emde Boas,et al.  Preserving Order in a Forest in Less Than Logarithmic Time and Linear Space , 1977, Inf. Process. Lett..

[7]  Xin He,et al.  Efficiently Identifying Max-Gap Clusters in Pairwise Genome Comparison , 2008, J. Comput. Biol.

[8]  B. Snel,et al.  Conservation of gene order: a fingerprint of proteins that physically interact. , 1998, Trends in biochemical sciences.

[9]  P Bork,et al.  Gene context conservation of a higher order than operons. , 2000, Trends in biochemical sciences.

[10]  B. Snel,et al.  The identification of functional modules from the genomic association of genes , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[11]  Xin He,et al.  Identifying Conserved Gene Clusters in the Presence of Homology Families , 2005, J. Comput. Biol.

[12]  Jens Stoye,et al.  Finding All Common Intervals of k Permutations , 2001, CPM.

[13]  Jens Stoye,et al.  Quadratic Time Algorithms for Finding Common Intervals in Two and More Sequences , 2004, CPM.

[14]  Gilles Didier,et al.  Common Intervals of Two Sequences , 2003, WABI.

[15]  R. Overbeek,et al.  The use of gene clusters to infer functional coupling. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[16]  J. Lawrence,et al.  Selfish operons: the evolutionary impact of gene clustering in prokaryotes and eukaryotes. , 1999, Current opinion in genetics & development.

[17]  S. Salzberg,et al.  Prediction of operons in microbial genomes. , 2001, Nucleic acids research.

[18]  Mathieu Raffinot,et al.  Gene teams: a new formalization of gene clusters for comparative genomics , 2003, Comput. Biol. Chem..

[19]  Joseph JáJá,et al.  An Introduction to Parallel Algorithms , 1992.