On the tandem duplication-random loss model of genome rearrangement

We initiate the algorithmic study of a new model of genome rearrangement, the tandem duplication-random loss model, in which a genome evolves via successive rounds of tandem duplication of a contiguous segment of genes, each followed by the loss of one copy of each duplicated gene. This model is well known in the evolutionary biology literature, where it has been used to explain many of the known rearrangements in vertebrate mitochondrial genomes. Based on the model, we formalize a notion of distance between two genomes and show how to compute it efficiently for two interesting regions of the parameter space. We then consider median problems (i.e., finding the point that minimizes the sum of distances to a given set of points under some distance function) in the context of maximum parsimony phylogenetic reconstruction for these two special cases. Surprisingly, one of them turns out to correspond to the well-known rank aggregation problem, while the other corresponds to the biologically interesting case of whole genome duplication and loss, and we give an O(log log n) additive approximation algorithm for the latter.
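To make the rearrangement step concrete, the following is a minimal sketch of a single round of tandem duplication followed by loss, assuming a genome is represented as a list of distinct gene labels. The function name `tdrl_step` and its parameters (`start`, `end`, `keep_first`) are hypothetical and chosen for illustration only; the sketch does not model the cost of a step or the distance computation.

```python
from typing import Hashable, List, Sequence, Set


def tdrl_step(
    genome: Sequence[Hashable],
    start: int,
    end: int,
    keep_first: Set[Hashable],
) -> List[Hashable]:
    """Apply one illustrative duplication-loss round to `genome`.

    The contiguous segment genome[start:end] is duplicated in tandem,
    giving two adjacent copies. For each duplicated gene, exactly one
    copy is then lost: genes in `keep_first` survive in the first copy,
    all other genes of the segment survive in the second copy.
    (Names and representation are assumptions, not the paper's notation.)
    """
    segment = list(genome[start:end])
    unknown = set(keep_first) - set(segment)
    if unknown:
        raise ValueError(f"genes not in the duplicated segment: {unknown}")

    # Survivors keep their relative order within each copy.
    first_copy = [g for g in segment if g in keep_first]
    second_copy = [g for g in segment if g not in keep_first]

    return list(genome[:start]) + first_copy + second_copy + list(genome[end:])


if __name__ == "__main__":
    # Duplicate the segment 2 3 4 5, keep 2 and 4 from the first copy
    # and 3 and 5 from the second copy.
    print(tdrl_step([1, 2, 3, 4, 5, 6], 1, 5, {2, 4}))
    # Whole genome duplication is the special case start=0, end=len(genome).
    print(tdrl_step([1, 2, 3, 4], 0, 4, {1, 3}))
```

In this representation, a whole genome duplication and loss step is simply the case where the duplicated segment spans the entire list.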
