Detecting patterns of co-variation in deep-sequenced virus populations

Advances in high-throughput sequencing (HTS) technologies have made it possible to assess the genetic diversity of heterogeneous virus populations at an unprecedented level of detail. However, technical errors confound the identification of true variants. Here, we present a comparative approach for identifying patterns of co-variation in deep-sequenced virus populations. In addition to sequencing errors, we account for other unknown sources of error by modeling the occurrences of mutation patterns with a multinomial distribution whose parameters follow a Dirichlet prior.
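
To make the Dirichlet-multinomial idea concrete, the following minimal sketch shows how counts of mutation patterns could be combined with a Dirichlet prior. It is an illustration under stated assumptions, not the method's actual implementation: the function names, the four-pattern layout for a pair of loci, the read counts, and the symmetric prior with all concentration parameters equal to 1 are hypothetical choices made for this example.

```python
from math import lgamma

def posterior_mean(counts, alpha):
    """Posterior mean of the pattern probabilities.

    With a Dirichlet(alpha) prior and multinomial counts, the posterior is
    Dirichlet(alpha + counts), whose mean is (alpha_k + n_k) / sum(alpha + n).
    """
    total = sum(counts) + sum(alpha)
    return [(c + a) / total for c, a in zip(counts, alpha)]

def log_marginal_likelihood(counts, alpha):
    """Log Dirichlet-multinomial probability of the observed counts,
    i.e. the multinomial likelihood with the pattern probabilities
    integrated out under the Dirichlet prior."""
    n = sum(counts)
    a0 = sum(alpha)
    # Log multinomial coefficient: n! / prod(n_k!)
    log_coef = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
    # Ratio of normalizing constants of the prior and the posterior
    log_ratio = lgamma(a0) - lgamma(n + a0)
    log_terms = sum(lgamma(c + a) - lgamma(a) for c, a in zip(counts, alpha))
    return log_coef + log_ratio + log_terms

# Hypothetical read counts for the four patterns at a pair of loci:
# (wild-type, wild-type), (mutant, wild-type), (wild-type, mutant), (mutant, mutant)
counts = [940, 25, 30, 5]
alpha = [1.0, 1.0, 1.0, 1.0]  # assumed symmetric prior, for illustration only

print(posterior_mean(counts, alpha))
print(log_marginal_likelihood(counts, alpha))
```

The prior acts as pseudo-counts, so a pattern that is never observed still receives a small nonzero probability. The marginal likelihood is the quantity one would typically compare across samples or models in a comparative setting.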