Virus Detection by High-Throughput Sequencing of Small RNAs: Large-Scale Performance Testing of Sequence Analysis Strategies.

Recent developments in high-throughput sequencing (HTS) technologies, also called next-generation sequencing (NGS), and in bioinformatics have drastically changed research on viral pathogens and spurred growing interest in virus diagnostics. However, the reliability of HTS-based virus detection protocols must be evaluated before they are adopted for diagnostics. Many bioinformatics algorithms for detecting viruses in HTS data have been reported, but little attention has been paid so far to their sensitivity and reliability for diagnostic purposes. We therefore compared the ability of 21 plant virology laboratories, each employing a different bioinformatics pipeline, to detect 12 plant viruses in a double-blind, large-scale performance test using 10 datasets of 21- to 24-nucleotide small RNA (sRNA) sequences from three different infected plants. The sensitivity of virus detection ranged from 35 to 100% among participants and dropped markedly as sequencing depth decreased. The false-positive detection rate was very low and was mainly related to the identification of host genome-integrated viral sequences or to misinterpretation of the results. Reproducibility was high (91.6%). This work revealed the key influence of bioinformatics strategies on the sensitive detection of viruses in HTS sRNA datasets and, more specifically, (i) the difficulty of detecting viral agents when they are novel or their sRNA abundance is low, (ii) the influence of key parameters at both the assembly and annotation steps, (iii) the importance of complete reference sequence databases, and (iv) the significant level of scientific expertise needed to interpret pipeline results. Overall, this work highlights these key parameters and proposes recommendations for reliable sRNA-based detection of known and unknown viruses.
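As an illustration of the sensitivity and false-positive figures quoted above, the following minimal Python sketch scores one participant's report against the known content of one dataset. The abstract does not give the scoring formula, so this assumes sensitivity is the percentage of viruses actually present that the participant reported, and that a false positive is any reported virus that is not present. The function names and the virus labels are hypothetical and do not come from the study.

```python
def sensitivity(expected, reported):
    """Percentage of expected viruses that were reported (assumed definition)."""
    expected = set(expected)
    if not expected:
        raise ValueError("expected must contain at least one virus")
    detected = expected & set(reported)
    return 100.0 * len(detected) / len(expected)


def false_positives(expected, reported):
    """Reported viruses that are not in the expected set, sorted by name."""
    return sorted(set(reported) - set(expected))


# Hypothetical example: four viruses present, three found, one spurious call.
expected = {"virus_A", "virus_B", "virus_C", "virus_D"}
reported = {"virus_A", "virus_B", "virus_C", "virus_X"}

print(sensitivity(expected, reported))      # 75.0
print(false_positives(expected, reported))  # ['virus_X']
```

Using sets keeps the comparison independent of the order in which a pipeline lists its hits and ignores duplicate calls for the same virus.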
