The infinite sites model of genome evolution

We formalize the problem of recovering the evolutionary history of a set of genomes that are related to an unseen common ancestor genome by operations of speciation, deletion, insertion, duplication, and rearrangement of segments of bases. The problem is examined in the limit as the number of bases in each genome goes to infinity. In this limit, the chromosomes are represented by continuous circles or line segments. For such an infinite-sites model, we present a polynomial-time algorithm to find the most parsimonious evolutionary history of any set of related present-day genomes.
