Unraveling ancient hexaploidy through multiply-aligned angiosperm gene maps.

Large-scale (segmental or whole) genome duplication has recurred throughout angiosperm evolution. Subsequent gene loss and rearrangements further affect gene copy numbers and fractionate ancestral gene linkages across multiple chromosomes. The fragmented "multiple-to-multiple" correspondences that result from this distinguishing feature of angiosperm evolution complicate comparative genomic studies. Using a robust computational framework that combines information from multiple orthologous and duplicated regions to construct local syntenic networks, we show that a shared ancient hexaploidy event (or perhaps two roughly concurrent genome fusions) can be inferred from the sequences of several divergent plant genomes. This "paleo-hexaploidy" clearly preceded the rosid-asterid split, but it remains equivocal whether it also affected monocots. The model resulting from our multi-alignments lays the foundation for approximating the number and arrangement of genes in the last universal common ancestor of angiosperms. Comparative analysis of inferred homologous genes derived from this model shows patterns of preferential gene retention or loss after polyploidy and reveals large variability in nucleotide substitution rates among plant nuclear genomes.
