Megabase Level Sequencing Reveals Contrasted Organization and Evolution Patterns of the Wheat Gene and Transposable Element Spaces

This article describes the molecular analysis of large contiguous sequences produced from the bread wheat genome. It provides novel insights into the number, distribution, and density of genes along chromosome 3B and reveals an unexpectedly high number of noncollinear genes compared with model grass genomes.

To improve our understanding of the organization and evolution of the wheat (Triticum aestivum) genome, we sequenced and annotated 13 megabase-sized contigs (18.2 Mb in total) originating from different regions of its largest chromosome, 3B (1 Gb), and produced a 2x chromosome survey by shotgun Illumina/Solexa sequencing. All regions carried genes irrespective of their chromosomal location. However, gene distribution was not random: 75% of the genes were clustered into small islands containing three genes on average. A twofold increase in gene density was observed toward the telomeres, likely due to high rates of tandem and interchromosomal duplication. A total of 3222 transposable elements were identified, including 800 new families. Most of them were complete but showed a highly nested structure spread over distances as large as 200 kb. A succession of amplification waves involving different transposable element families led to contrasted sequence compositions between the proximal and distal regions. Finally, with an estimate of 50,000 genes per diploid genome, our data suggest that wheat may have a higher gene number than other cereals. Indeed, comparisons with rice (Oryza sativa) and Brachypodium revealed that a high number of additional noncollinear genes are interspersed within a highly conserved ancestral grass gene backbone, supporting the idea of an accelerated evolution in the Triticeae lineages.
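To make the notion of a gene island concrete, the following is a minimal Python sketch of one way such clustering could be computed from annotated gene coordinates. It is not the authors' procedure: the function names, the `max_gap` distance threshold, and the toy coordinates are all assumptions introduced here for illustration, and the abstract does not state which island definition was used.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) of a gene on a contig, in bp


def group_into_islands(gene_spans: List[Span], max_gap: int) -> List[List[Span]]:
    """Group genes into islands.

    Consecutive genes join the same island when the gap between the
    current island's rightmost end and the next gene's start is at
    most max_gap bp. max_gap is a hypothetical threshold.
    """
    islands: List[List[Span]] = []
    island_end = 0
    for start, end in sorted(gene_spans):
        if islands and start - island_end <= max_gap:
            islands[-1].append((start, end))
            island_end = max(island_end, end)
        else:
            islands.append([(start, end)])
            island_end = end
    return islands


def island_summary(islands: List[List[Span]], min_genes: int = 2) -> Tuple[float, float]:
    """Return (fraction of genes in islands, mean genes per island).

    Only groups with at least min_genes genes count as islands;
    isolated genes are excluded from the mean.
    """
    total = sum(len(island) for island in islands)
    clustered = [island for island in islands if len(island) >= min_genes]
    n_clustered = sum(len(island) for island in clustered)
    fraction = n_clustered / total if total else 0.0
    mean_size = n_clustered / len(clustered) if clustered else 0.0
    return fraction, mean_size


# Toy coordinates, invented for illustration only.
toy_genes = [
    (1_000, 3_000), (8_000, 9_500), (12_000, 15_000),
    (200_000, 203_000),
    (400_000, 402_000), (410_000, 411_500),
]
toy_islands = group_into_islands(toy_genes, max_gap=20_000)
toy_fraction, toy_mean = island_summary(toy_islands)
```

With these toy values, the six genes fall into groups of three, one, and two genes, so five of six genes are clustered and the mean island size is 2.5. The reported figures (75% of genes in islands, three genes per island on average) are the same two summary statistics, whatever definition of an island the authors applied.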
