Discovery and Analyses of Caulimovirid-like Sequences in Upland Cotton (Gossypium hirsutum)

Analyses of Illumina-based high-throughput sequencing data generated during characterization of the cotton leafroll dwarf virus population in Mississippi (2020–2022) consistently yielded contigs of varying size (most frequently 4 to 7 kb) with identical nucleotide content that shared similarities with reverse transcriptases (RTases) encoded by extant plant pararetroviruses (family Caulimoviridae). These initial data prompted an in-depth study combining molecular and bioinformatic approaches to characterize the nature and origins of these caulimovirid-like sequences. Here, we report endogenous viral elements (EVEs) related to extant members of the family Caulimoviridae that are integrated into the genome of upland cotton (Gossypium hirsutum), for which we propose the provisional name “endogenous cotton pararetroviral elements” (eCPRVE). Our investigations pinpointed a ~15 kb-long locus on chromosome A04 consisting of head-to-head oriented tandem copies located on the positive- and negative-sense DNA strands (eCPRVE+ and eCPRVE-). The eCPRVE+ sequence comprises a nearly complete, slightly decayed genome, including ORFs coding for the viral movement protein (MP), coat protein (CP), RTase, and transactivator/viroplasm protein (TA). Phylogenetic analyses of major viral proteins suggest that eCPRVE+ may have been derived from the genome of a cognate virus belonging to a putative new genus within the family. Unexpectedly, an identical 15 kb-long locus composed of two eCPRVE copies was also detected in a newly recognized species, G. ekmanianum, shedding some light on the relatively recent evolution within the cotton family.
