Unbiased K-mer Analysis Reveals Changes in Copy Number of Highly Repetitive Sequences During Maize Domestication and Improvement

Repetitive elements are the major component of complex genomes, yet they remain recalcitrant to characterization. Using maize as a model system, we applied k-mer analysis to whole genome shotgun (WGS) sequences of the two maize inbred lines B73 and Mo17 to quantify the differences between the two genomes. Significant differences were identified in highly repetitive sequences, including centromere, 45S ribosomal DNA (rDNA), knob, and telomere repeats. Genotype-specific 45S rDNA sequences were discovered. The k-mers polymorphic between B73 and Mo17 were used to examine allele-specific expression of 45S rDNA in the hybrids. Although Mo17 contains a higher 45S rDNA copy number than B73, the equivalent levels of overall 45S rDNA expression indicate that transcriptional or post-transcriptional regulatory mechanisms operate on the 45S rDNA in the hybrids. Using WGS sequences of B73xMo17 doubled haploids, genomic locations showing differential repetitive content were genetically mapped, revealing that highly repetitive sequences are organized differently in the two genomes. In an analysis of WGS sequences of HapMap2 lines, including the wild progenitor of maize, landraces, and improved lines, decreases and increases in the abundance of additional sets of k-mers associated with centromere, 45S rDNA, knob, and retrotransposon sequences were found among groups, revealing global evolutionary trends of genomic repeats during maize domestication and improvement.
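The core idea of comparing two genomes by k-mer abundance can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, whose details are not given here. The function names (`canonical`, `count_kmers`, `log2_ratios`), the pseudocount, and the normalization by total k-mer count are all assumptions made for illustration; a real analysis of WGS data would use a dedicated k-mer counter and account for sequencing depth and error.

```python
from collections import Counter
from math import log2

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers over an iterable of read sequences.

    K-mers containing characters other than A, C, G, T (for example N) are skipped.
    """
    counts = Counter()
    valid = set("ACGT")
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= valid:
                counts[canonical(kmer)] += 1
    return counts


def log2_ratios(counts_a, counts_b, pseudocount=1.0):
    """Compute log2(abundance in A / abundance in B) for every observed k-mer.

    Counts are normalized by the total k-mer count of each sample (an assumed,
    simplistic stand-in for depth normalization); the pseudocount avoids
    division by zero for k-mers seen in only one sample.
    """
    total_a = sum(counts_a.values()) or 1
    total_b = sum(counts_b.values()) or 1
    ratios = {}
    for kmer in counts_a.keys() | counts_b.keys():
        freq_a = (counts_a[kmer] + pseudocount) / total_a
        freq_b = (counts_b[kmer] + pseudocount) / total_b
        ratios[kmer] = log2(freq_a / freq_b)
    return ratios
```

Under these assumptions, k-mers with strongly positive or negative log2 ratios are candidates for sequences that differ in copy number between the two samples, which is the kind of signal the abstract reports for centromere, 45S rDNA, knob, and telomere repeats.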
