A novel virus with a (+) single-stranded RNA genome was detected by high-throughput sequencing (HTS) in a sample of grapevine (Vitis vinifera) cv. Kizil Sapak (sample/isolate 127) that originated from Turkmenistan. The complete genome of the virus, tentatively named “grapevine Kizil Sapak virus” (GKSV), is 7,604 nucleotides in length, excluding the poly(A) tail. The genome organization, encoded genes, and sequence domains of GKSV are typical of members of the family Betaflexiviridae, specifically those belonging to the subfamily Trivirinae. Phylogenetic analysis placed GKSV within the subfamily Trivirinae, in the same clade as fig latent virus 1 (FLV-1) but distinct from the clades formed by members of other genera. A comparative analysis of GKSV-127 with the HTS-derived sequences obtained from two additional isolates showed that they are genetic variants of the same virus species. Based on current ICTV species and genus demarcation criteria, and the results of the sequence and phylogenetic analyses, we propose that GKSV and FLV-1 belong to a new genus within the subfamily Trivirinae.

The family Betaflexiviridae is one of five families in the order Tymovirales and consists of the subfamilies Quinvirinae and Trivirinae (https://talk.ictvonline.org/taxonomy/). The latter is the more diverse, with nine genera (Capillovirus, Chordovirus, Citrivirus, Divavirus, Prunevirus, Tepovirus, Trichovirus, Vitivirus and Wamavirus), whereas the former has only three (Carlavirus, Foveavirus and Robigovirus). A common feature of all members of the family is their flexuous filamentous virions, which are 12-13 nm in diameter and 600-1000 nm in length [1]. The capped (or probably capped) linear, positive-sense, single-stranded (ss) RNA genomes of betaflexiviruses range in length from 5.9 to 9.0 kb and are partitioned into two (Capillovirus) to five (Foveavirus and Vitivirus) open reading frames (ORFs) with different functions. 
Regardless of their subfamily and genus designations, each member typically codes for a replicase protein (Rep) with a size range of 190-250 kDa, a movement protein (MP) that is either of the “30K” superfamily or of the triple gene block (TGB) type, and a coat protein (CP) that ranges in size from 18 to 41 kDa [1]. Some members, such as vitiviruses, carlaviruses, and trichoviruses, may also code for a nucleic-acid-binding protein (NABP), which has been implicated in RNA silencing suppression activity, as documented for grapevine virus A [2–4]. Members of the Betaflexiviridae also show variation in their genome organization. For instance, the “30K” type MP of capilloviruses is nested within ORF1, and that of citriviruses and trichoviruses overlaps with the Rep, whereas the vitivirus homologs are separated by a hypothetical protein, and in chordoviruses the two proteins are separated by an intergenic non-coding sequence. Also, viruses of the same genus or of different genera can differ in their biological properties: several members of the Betaflexiviridae have been shown to induce distinct symptoms in their primary and indicator hosts, whereas several others cause asymptomatic infection.

Handling Editor: Ioannis E. Tzanetakis.
Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s00705-019-04434-3) contains supplementary material, which is available to authorized users.
* Maher Al Rwahnih, malrwahnih@ucdavis.edu
1 Department of Plant Pathology, University of California, Davis, Davis, CA 95616, USA
2 Department of Plant Pathology and Microbiology, Texas A&M AgriLife Research and Extension Center, Weslaco, TX 78596, USA
3 Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA

In 2014, a new selection of white table/wine grape (Vitis vinifera) cv. 
Kizil Sapak (sample 127) was received from Turkmenistan for inclusion in the Foundation Plant Services (FPS, University of California, Davis) collection. The vine was grown in a screenhouse and assayed for a panel of grapevine viruses in the FPS pipeline as described by Al Rwahnih et al. [5]. In addition, the material was subjected to high-throughput sequencing (HTS) analysis as part of the routine testing procedure at FPS. Briefly, total nucleic acid (TNA) extracts were prepared from leaf petioles of sample 127 using a MagMax Plant RNA Isolation Kit (Thermo Fisher Scientific) and used as the template for cDNA library construction with a TruSeq Stranded Total RNA with Ribo-Zero Plant Kit (Illumina), as per the manufacturer’s protocol. The cDNA library was sequenced on the Illumina NextSeq 500 platform, yielding 25.6 million single and 156.4 million paired-end raw reads, which were filtered and trimmed using Illumina bcl2fastq software. Viral sequences were obtained from SPAdes v3.13.0 [6] assemblies of the deep paired-end Illumina sequence data (Supplementary Table 1), preprocessed with FLASH2 [7]. The HTS analysis revealed a mixed infection with several grapevine viruses/viroids (data not shown), including one large contig of 7,590 nucleotides (nt) that showed a distant relationship (39% to 48% identity; 42% to 63% coverage) to several members of the subfamily Trivirinae (family Betaflexiviridae) in tBLASTx searches [8]. The genome sequence of the putative betaflexivirus from sample 127 was extended to completion by 5′ and 3′ RACE using a FirstChoice RLM-RACE Kit (Thermo Fisher Scientific) and determined to be 7,604 nt in length, excluding the poly(A) tail (GenBank no. MN172165). The virus was tentatively named “grapevine Kizil Sapak virus” (GKSV) after the cultivar in which it was found, since no discernible symptoms were associated with its occurrence. 
A search of the GKSV-127 sequence with the program ORF Finder (https://www.ncbi.nlm.nih.gov/orffinder/) revealed five potential protein-encoding segments, four of which were verified using the SMART BLAST or BLASTP tools and determined to show significant matches to the corresponding proteins of members of the family Betaflexiviridae. The 5′ untranslated region (UTR) of the virus is 170 nt long. The predicted ORF1 (nt position 171-5,339) codes for a 196.7-kDa Rep, which is a typical size for members of the family Betaflexiviridae and within the size range of currently described vitiviruses (typically 190-200 kDa; [1]). Analysis of the Rep sequence using the Pfam program [9] led to the identification of conserved domains for methyltransferase (Mtr; nt position 300-1,166), helicase (Hel; nt position 2,907-3,644), and RNA-dependent RNA polymerase (RdRp; nt position 4,113-5,276), all with highly significant E-values (<1.0). A pairwise comparison of this protein with homologs from the family produced the highest level of amino acid (aa) identity, 55.4%, with the corresponding sequence of fig latent virus 1 (FLV-1; GenBank no. FN377573), followed by 32.6-33.4% with members of the genus Trichovirus. The predicted ORF2 (nt position 5,422-6,279) codes for a 31.7-kDa MP, and its Pfam analysis showed that it belongs to the “30K” superfamily type of MPs. Notably, the Rep and MP of GKSV-127 are separated by an 84-nt non-coding intergenic region (Fig. 1), and this was confirmed by reverse transcription PCR (RT-PCR) amplification of a 1.2-kb DNA fragment with primers designed downstream of the Rep and upstream of the MP, followed by Sanger sequencing of 20 independent recombinant clones (data not shown). The predicted ORF3 (nt position 6,158-6,754) overlaps with the MP and codes for a 22.0-kDa CP (Fig. 1); its Pfam analysis showed that its only matches were to members of the trichovirus CP family (E-value: 5.4e-16). 
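The coordinates reported above can be turned into protein lengths and domain positions with a few lines of arithmetic. The following is a minimal sketch, not the authors' method: it assumes that each reported ORF end coordinate includes the stop codon, and it uses a rule-of-thumb average of about 110 Da per residue, so the estimated masses are only approximations of the reported values. The function names are hypothetical.

```python
# ORF and Rep domain coordinates (1-based, inclusive) as reported for GKSV-127.
ORFS = {
    "Rep": (171, 5339),
    "MP": (5422, 6279),
    "CP": (6158, 6754),
}
REP_DOMAINS = {
    "Mtr": (300, 1166),
    "Hel": (2907, 3644),
    "RdRp": (4113, 5276),
}

# Assumption: rough average residue mass; not taken from the article.
AVG_RESIDUE_DA = 110.0


def protein_length(start, end):
    """Residues encoded by an ORF, assuming the final codon is a stop codon."""
    nt = end - start + 1
    if nt % 3:
        raise ValueError("ORF length is not a multiple of three")
    return nt // 3 - 1


def nt_to_aa(orf_start, nt_start, nt_end):
    """Map a nucleotide span inside an ORF to 1-based residue positions."""
    first = (nt_start - orf_start) // 3 + 1
    last = (nt_end - orf_start) // 3 + 1
    return first, last


def in_frame(orf_start, nt_start, nt_end):
    """True if the span starts and ends on codon boundaries of the ORF."""
    return (nt_start - orf_start) % 3 == 0 and (nt_end - orf_start + 1) % 3 == 0


if __name__ == "__main__":
    for name, (start, end) in ORFS.items():
        aa = protein_length(start, end)
        print(f"{name}: {aa} aa, roughly {aa * AVG_RESIDUE_DA / 1000:.1f} kDa")
    rep_start = ORFS["Rep"][0]
    for name, (start, end) in REP_DOMAINS.items():
        first, last = nt_to_aa(rep_start, start, end)
        print(f"{name}: Rep residues {first}-{last}, "
              f"in frame: {in_frame(rep_start, start, end)}")
```

A check of this kind is mainly useful for catching transcription errors in coordinates: a domain span that does not fall on codon boundaries of its ORF, or an ORF whose length is not a multiple of three, would be flagged immediately.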
In pairwise comparisons, the CP of GKSV-127 shared the highest level of aa sequence identity, at 35.7%, with the corresponding sequence of FLV-1 (GenBank no. FN377573), and was 24.8-30.1% identical to those of members of the genus Trichovirus. The predicted ORF4 of GKSV-127 (nt position 6,784-7,365) is separated from the CP by a 162-nt non-coding intergenic region and codes for a 21.4-kDa protein of unknown function with no significant homology to known proteins based on Pfam analysis. The predicted ORF5 (nt position 7,262-7,570) overlaps with the 21.4-kDa protein and codes for an 11.7-kDa NABP (Fig. 1); its Pfam analysis returned significant homology to the Carla_C4 family of the clan of viral NABP (E-value: 5.8e-08). The 3′ UTR of GKSV-127, excluding the poly(A) tail, is 35 nt long.

Fig. 1 Genome organization of grapevine Kizil Sapak virus (GKSV). Five predicted open reading frames (ORFs) are shown as rectangular boxes: replicase (REP; ORF1; 196.7 kDa), nt 171-5339; movement protein (MP; ORF2; 31.7 kDa), nt 5422-6279; coat protein (CP; ORF3; 22.0 kDa), nt 6158-6754; hypothetical protein (ORF4; 21.4 kDa), nt 6784-7365; and nucleic acid binding protein (NABP; ORF5; 11.7 kDa), nt 7262-7570

[Figure: phylogenetic tree of members of the family Betaflexiviridae; tip labels give GenBank accession number, virus acronym, and isolate, with GKSV-127 next to FN377573-FLV-1-f5p5]
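The overlaps between adjacent ORFs described in the text and in Fig. 1 can be derived from the coordinates alone. The sketch below is illustrative only: the helper names are hypothetical, and the reading frame is computed relative to genome position 1, which is an arbitrary convention chosen here.

```python
from itertools import combinations

# ORF coordinates (1-based, inclusive) as listed in the Fig. 1 caption.
ORFS = {
    "Rep": (171, 5339),
    "MP": (5422, 6279),
    "CP": (6158, 6754),
    "ORF4": (6784, 7365),
    "NABP": (7262, 7570),
}


def overlap(a, b):
    """Return the shared span of two inclusive intervals, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def overlapping_pairs(orfs):
    """Map each overlapping ORF pair to the length of the shared span in nt."""
    pairs = {}
    for (name_a, span_a), (name_b, span_b) in combinations(orfs.items(), 2):
        shared = overlap(span_a, span_b)
        if shared is not None:
            pairs[(name_a, name_b)] = shared[1] - shared[0] + 1
    return pairs


def frame(start):
    """Reading frame (0, 1, or 2) of an ORF relative to genome position 1."""
    return (start - 1) % 3


if __name__ == "__main__":
    for (name_a, name_b), length in overlapping_pairs(ORFS).items():
        print(f"{name_a}/{name_b}: {length} nt shared, frames "
              f"{frame(ORFS[name_a][0])} and {frame(ORFS[name_b][0])}")
```

With these coordinates, only the MP/CP and ORF4/NABP pairs share sequence, and in each pair the two ORFs lie in different reading frames, as expected for overlapping genes.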
[1] Silvio C. E. Tosatto, et al. The Pfam protein families database in 2019. Nucleic Acids Res., 2018.
[2] Sergey I. Nikolenko, et al. SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J. Comput. Biol., 2012.
[3] G. Martelli, et al. Identification of an RNA-silencing suppressor in the genome of Grapevine virus A. The Journal of general virology, 2006.
[4] E. Koonin, et al. Diverse suppressors of RNA silencing enhance agroinfection by a viral replicon. Virology, 2006.
[5] Thomas L. Madden, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic acids research, 1997.
[6] D. Golino, et al. Comparison of Next-Generation Sequencing Versus Biological Indexing for the Optimal Detection of Viral Pathogens in Grapevine. Phytopathology, 2015.
[7] M. Mawassi, et al. The ORF3-encoded proteins of vitiviruses GVA and GVB induce tubule-like and punctate structures during virus infection and localize to the plasmodesmata. Virus research, 2012.
[8] S. Salzberg, et al. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinform., 2011.