Internal Validation of Ancestral Gene Order Reconstruction in Angiosperm Phylogeny

Whole genome doubling (WGD), a frequent occurrence during the evolution of the angiosperms, complicates ancestral gene order reconstruction because the genome halving process admits many solutions. Using the genome of a related species (the outgroup) to guide the halving of a WGD descendant attenuates this problem. We investigate a battery of techniques for further improvement: an unbiased version of the guided genome halving algorithm, reference to two related genomes instead of only one to guide the reconstruction, use of draft genome sequences in contig form only, incorporation of incomplete sets of homology correspondences among the genomes, and addition of large numbers of "singleton" correspondences. We evaluate these techniques using genomic distance, breakpoint reuse rate, dispersion of sets of alternate solutions and other means, while reconstructing the pre-WGD ancestor of Populus trichocarpa as well as an early rosid ancestor.
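
As an illustration of one of the evaluation measures named above, the following minimal sketch counts breakpoints between two signed gene orders and derives a breakpoint reuse rate. It assumes the commonly used definition r = 2d/b, where d is a rearrangement distance supplied by the caller and b is the number of breakpoints; the abstract does not state the exact formula, so this definition, the linear-chromosome model and the function names `adjacencies`, `breakpoints` and `reuse_rate` are all assumptions made for illustration.

```python
def adjacencies(chromosome):
    """Return the set of adjacencies of one linear chromosome.

    A chromosome is a list of signed integers (gene order with orientation).
    Each gene has a tail 't' and a head 'h'; an adjacency is the unordered
    pair of gene extremities that touch between two consecutive genes.
    """
    def left(g):
        # Extremity met first when reading the gene left to right.
        return (abs(g), 't' if g > 0 else 'h')

    def right(g):
        # Extremity met last when reading the gene left to right.
        return (abs(g), 'h' if g > 0 else 't')

    return {frozenset((right(x), left(y)))
            for x, y in zip(chromosome, chromosome[1:])}


def breakpoints(genome_a, genome_b):
    """Count adjacencies of genome_a that are absent from genome_b.

    Each genome is a list of chromosomes (lists of signed integers).
    """
    adj_b = set().union(*(adjacencies(c) for c in genome_b))
    return sum(1 for c in genome_a
               for adj in adjacencies(c) if adj not in adj_b)


def reuse_rate(distance, n_breakpoints):
    """Breakpoint reuse rate under the assumed definition r = 2d / b."""
    if n_breakpoints == 0:
        raise ValueError("reuse rate is undefined when there are no breakpoints")
    return 2 * distance / n_breakpoints


# Hypothetical toy genomes: b differs from a by one inversion of genes 2..3.
a = [[1, 2, 3, 4, 5]]
b = [[1, -3, -2, 4, 5]]
n_bp = breakpoints(a, b)
rate = reuse_rate(1, n_bp)  # distance 1: a single inversion separates a and b
```

Under this assumed definition, a rate near 1 means each rearrangement used two fresh breakpoints, while a rate approaching 2 means breakpoints were reused heavily, which is the pattern expected when a reconstruction is far from the true history.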
