Viral Quasispecies Spectrum Reconstruction via Coloring the Vertex in the Weighted Read Conflict Graph

Viruses exist in their hosts as a collection of related viral haplotypes, called a viral quasispecies. Because the composition of a viral quasispecies is clinically significant, assembling the haplotypes of a quasispecies from a set of sequenced reads is one of the most challenging problems in bioinformatics. In this paper, a weighted read conflict graph is constructed using a fuzzy distance and a given threshold, and a viral quasispecies assembly algorithm, CWSS, is proposed based on vertex coloring. The CWSS algorithm colors the vertices in an order determined by their sum of edge weights and their saturation degree, so that any two adjacent vertices receive different colors. The time complexity of the CWSS algorithm is \(O(m^{2}n+mn)\). Simulated datasets of HIV quasispecies were used to compare the reconstruction performance of CWSS with that of the Dsatur algorithm, which solves the reconstruction problem by coloring an unweighted read conflict graph. The experimental results show that CWSS estimates the number of quasispecies much more accurately than Dsatur and still performs well on reads (or read pairs) with high error rates.
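
The idea of coloring a weighted read conflict graph can be illustrated with a minimal Python sketch. It is not the CWSS implementation: the abstract does not define the fuzzy distance or how the two vertex criteria are combined, so the sketch assumes a simple mismatch fraction over the overlap as the edge weight, and it picks the next vertex by saturation degree first and by sum of edge weights as a tie-break. The names `conflict_weight`, `build_conflict_graph`, and `color_graph` are hypothetical.

```python
from itertools import combinations


def conflict_weight(read_a, read_b):
    """Mismatch fraction over the overlap of two aligned reads.

    Each read is a (start_position, sequence) pair. This is a stand-in
    for the paper's fuzzy distance, which the abstract does not define.
    Returns 0.0 when the reads do not overlap.
    """
    (start_a, seq_a), (start_b, seq_b) = read_a, read_b
    lo = max(start_a, start_b)
    hi = min(start_a + len(seq_a), start_b + len(seq_b))
    if hi <= lo:
        return 0.0
    mismatches = sum(
        seq_a[i - start_a] != seq_b[i - start_b] for i in range(lo, hi)
    )
    return mismatches / (hi - lo)


def build_conflict_graph(reads, threshold):
    """Connect two reads when their conflict weight exceeds the threshold."""
    adj = {v: {} for v in range(len(reads))}
    for u, v in combinations(range(len(reads)), 2):
        w = conflict_weight(reads[u], reads[v])
        if w > threshold:
            adj[u][v] = w
            adj[v][u] = w
    return adj


def color_graph(adj):
    """Greedy coloring: adjacent vertices always get different colors.

    Assumed ordering: highest saturation degree (number of distinct
    colors among colored neighbors), ties broken by the larger sum of
    incident edge weights.
    """
    colors = {}

    def priority(v):
        saturation = len({colors[u] for u in adj[v] if u in colors})
        return (saturation, sum(adj[v].values()))

    while len(colors) < len(adj):
        v = max((x for x in adj if x not in colors), key=priority)
        used = {colors[u] for u in adj[v] if u in colors}
        color = 0
        while color in used:
            color += 1
        colors[v] = color
    return colors


# Toy input: two reads from each of two made-up haplotypes.
reads = [(0, "ACGTA"), (3, "TACGT"), (0, "ATGTT"), (3, "TTCGA")]
graph = build_conflict_graph(reads, threshold=0.1)
coloring = color_graph(graph)
print(len(set(coloring.values())), coloring)
```

In this sketch, reads that share a color are mutually compatible and could be merged into one haplotype, so the number of colors used serves as the estimate of the number of quasispecies.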
