Full-length haplotype reconstruction to infer the structure of heterogeneous virus populations

Next-generation sequencing (NGS) technologies enable new insights into the diversity of virus populations within their hosts. Diversity estimation is currently restricted to single-nucleotide variants or to local fragments of no more than a few hundred nucleotides, a limit set by the length of the sequence reads. To study complex heterogeneous virus populations comprehensively, novel methods are required that allow complete reconstruction of the individual viral haplotypes. Here, we show that whole viral genomes of ∼8600 nucleotides in length can be assembled from mixtures of heterogeneous HIV-1 strains, derived both from defined combinations of cloned virus strains and from clinical samples of an HIV-1 superinfected individual. Haplotype reconstruction was achieved using optimized experimental protocols and computational methods for amplification, sequencing and assembly. We compared the performance of three NGS platforms (454 Life Sciences/Roche, Illumina and Pacific Biosciences) for this task. Our results demonstrate and delineate the feasibility of NGS-based full-length viral haplotype reconstruction and provide new tools for studying the evolution and pathogenesis of viruses.
