Illumina-based sequencing framework for accurate detection and mapping of influenza virus defective interfering particle-associated RNAs

The mechanisms and consequences of defective interfering particle (DIP) formation during influenza virus infection remain poorly understood. The development of next-generation sequencing (NGS) technologies has made it possible to identify large numbers of DIP-associated sequences, providing a powerful tool for understanding their biological relevance. However, NGS approaches pose numerous technical challenges, including the precise identification and mapping of deletion junctions in the presence of frequent mutation and base-calling errors, and the potential for numerous experimental and computational artifacts. Here we detail an Illumina-based sequencing framework and bioinformatics pipeline capable of generating highly accurate and reproducible profiles of DIP-associated junction sequences. We use a combination of simulated and experimental control datasets to optimize pipeline performance and demonstrate the absence of significant artifacts. Finally, we use this optimized pipeline to generate a high-resolution profile of DIP-associated junctions produced during influenza virus infection and demonstrate how these data can provide insight into mechanisms of DIP formation. This work highlights the specific challenges associated with NGS-based detection of DIP-associated sequences, and details the computational and experimental controls required for such studies.

Importance: Influenza virus defective interfering particles (DIPs) that harbor internal deletions within their genomes occur naturally during infection in humans and cell culture. They have been hypothesized to influence the pathogenicity of the virus; however, their specific function remains elusive. The accurate detection of DIP-associated deletion junctions is crucial for understanding DIP biology but is complicated by an array of technical issues that can bias or confound results. Here we demonstrate a combined experimental and computational framework for detecting DIP-associated deletion junctions using next-generation sequencing (NGS). We detail how to validate pipeline performance and provide the bioinformatics pipeline for groups interested in using it. Using this optimized pipeline, we detect hundreds of distinct deletion junctions generated during influenza A virus (IAV) infection, and use these data to test a long-standing hypothesis concerning the molecular details of DIP formation.
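To make the junction-mapping challenge concrete, the following is a minimal Python sketch of how a single deletion junction might be located in one read. It is not the pipeline described in this work: the function name `find_deletion_junction`, the `min_anchor` parameter, and the toy sequences are all hypothetical, and the sketch assumes exact matching with no tolerance for mutations or base-calling errors. It also shows one reason precise mapping is hard: when the bases just after the donor site repeat at the acceptor site, several junction placements explain the read equally well.

```python
def find_deletion_junction(reference, read, min_anchor=5):
    """Locate one internal deletion in `read` relative to `reference`.

    Returns (donor, acceptor, ambiguity) or None, where donor is the 1-based
    position of the last reference base before the deletion (left-most
    placement), acceptor is the 1-based position of the first base after it,
    and ambiguity is how many bases the junction could shift to the right.
    Assumption: exact matches only, and both anchors are unique in the
    reference (hypothetical simplifications for illustration).
    """
    if len(read) < 2 * min_anchor:
        return None
    start = reference.find(read[:min_anchor])        # 0-based start of left anchor
    end_anchor = reference.rfind(read[-min_anchor:])  # 0-based start of right anchor
    if start < 0 or end_anchor < 0:
        return None
    end = end_anchor + min_anchor                     # exclusive end of right anchor

    # Extend the left anchor forward while read and reference agree.
    p = 0
    while (p < len(read) and start + p < len(reference)
           and reference[start + p] == read[p]):
        p += 1

    # Extend the right anchor backward while read and reference agree.
    s = 0
    while (s < len(read) and end - 1 - s >= 0
           and reference[end - 1 - s] == read[-1 - s]):
        s += 1

    if p == len(read) or s == len(read):
        return None   # read aligns contiguously: no deletion
    if p + s < len(read):
        return None   # some read bases are explained by neither flank

    ambiguity = p + s - len(read)        # bases shared by both flanks
    donor = start + (len(read) - s)      # 1-based, left-most placement
    acceptor = end - s + 1               # 1-based
    if acceptor - donor - 1 <= 0:
        return None
    return donor, acceptor, ambiguity


# Toy example (made-up sequences): 10-base left flank, 14 deleted bases,
# 10-base right flank. The deleted region and the right flank both start
# with "GA", so the junction position is ambiguous by 2 bases.
reference = "ATGGCAAGTC" + "GATTTAAGCGTAAT" + "GACCTGATTC"
read = "GGCAAGTC" + "GACCTGAT"
print(find_deletion_junction(reference, read))  # (10, 25, 2)
```

Reporting the left-most placement together with the ambiguity width is only one possible convention; whichever convention is chosen, applying it consistently keeps identical junctions from being counted as distinct.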
