The nature and genomic landscape of repetitive DNA classes in Chrysanthemum nankingense shows recent genomic changes

Abstract

Background and Aims: Tandemly repeated DNA and transposable elements represent most of the DNA in higher plant genomes. High-throughput sequencing allows a survey of the DNA in a genome, but whole-genome assembly can miss a substantial fraction of highly repeated sequence motifs. Chrysanthemum nankingense (2n = 2x = 18; genome size = 3.07 Gb; Asteraceae), a diploid reference for the many auto- and allopolyploids in the genus, is considered an ancestral species and serves as an ornamental plant and high-value food. We aimed to characterize the major repetitive DNA motifs, understand their structure and identify key features that are shaped by genome and sequence evolution.

Methods: Graph-based clustering with RepeatExplorer was used to identify and classify repetitive motifs in 2.14 million 250-bp paired-end Illumina reads from total genomic DNA of C. nankingense. Independently, the frequency of all canonical motifs k bases long was counted in the raw read data, and abundant k-mers (k = 16, 21, 32, 64 and 128) were extracted and assembled to generate longer contigs for repetitive motif identification. For comparison, long terminal repeat retrotransposons were checked in the published C. nankingense reference genome. Fluorescent in situ hybridization was performed to show the chromosomal distribution of the main types of repetitive motifs.

Key Results: Apart from rDNA (0.86 % of the total genome), a few microsatellites (0.16 %) and telomeric sequences, no highly abundant tandem repeats were identified. Transposable elements were abundant: 40 % of the genome had sequences with recognizable domains related to transposable elements. Long terminal repeat retrotransposons were widely distributed over the chromosomes, although different sequence families had characteristic features such as abundance at, or exclusion from, centromeric or subtelomeric regions. Another group of very abundant repetitive motifs, including those mostly identified as low-complexity sequences (9.07 % of the genome), showed no similarity to known sequence motifs or tandemly repeated elements.

Conclusions: The Chrysanthemum genome has an unusual structure, with a very low proportion of tandemly repeated sequences (~1.02 % of the genome) and a high proportion of low-complexity sequences, most likely the degenerated remains of transposable elements. Identifying the presence, nature and genomic organization of major genome fractions enables inference of the evolutionary history of sequences, including degeneration and loss, which is critical to understanding biodiversity and diversification processes in the genomes of diploid and polyploid Chrysanthemum, Asteraceae and plants more widely.
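The k-mer step described in the Methods (counting canonical motifs k bases long in raw reads, then keeping the abundant ones) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`canonical`, `count_canonical_kmers`, `abundant_kmers`), the abundance threshold and the toy reads are all hypothetical, and "canonical" is assumed here to mean the lexicographically smaller of a k-mer and its reverse complement, which is a common convention.

```python
from collections import Counter

# Translation table for complementing DNA bases (assumes upper-case A/C/G/T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_canonical_kmers(reads, k):
    """Count canonical k-mers over an iterable of read sequences (hypothetical helper)."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def abundant_kmers(counts, min_count):
    """Keep only k-mers seen at least min_count times (threshold is an assumption)."""
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGT"]  # toy data, not from the study
    counts = count_canonical_kmers(toy_reads, k=3)
    print(abundant_kmers(counts, min_count=2))
```

For read sets of the size used in the study, a dedicated k-mer counter would be the practical choice; the sketch only shows why both strands collapse onto one key, so that a repeat is counted once regardless of read orientation.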
