Viral Genetic Linkage Analysis in the Presence of Missing Data

Analyses of viral genetic linkage can provide insight into HIV transmission dynamics and the impact of prevention interventions. For example, such analyses have the potential to determine whether recently infected individuals acquired viruses circulating within or outside a given community. They may also identify characteristics of chronically infected individuals that make their viruses likely to cluster with others circulating within a community. Such clustering can be related to the potential of these individuals to contribute to the spread of the virus, either directly through transmission to their partners or indirectly through further spread of HIV from those partners. Assessment of the extent to which individual (incident or prevalent) viruses are clustered within a community will be biased if only a subset of subjects is observed, especially if that subset is not representative of the entire HIV-infected population. To address this concern, we develop a multiple imputation framework in which missing sequences are imputed based on a model for the diversification of viral genomes. The imputation method decreases the bias in clustering that arises from informative missingness. We illustrate the methods with data from a household survey conducted in a village in Botswana, and demonstrate that the multiple imputation approach reduces the bias in the estimated overall proportion of clustering that is caused by missing observations.
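
The two ideas in the abstract that lend themselves to a small illustration are (a) how observing only a subset of individuals changes the apparent proportion of clustered viruses, and (b) how estimates from several imputed data sets can be combined. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the toy cluster labels, the function names `clustered_proportion` and `pool_rubin`, and the use of the standard Rubin combining rules are all assumptions made here for illustration, and the model-based imputation of sequences itself is not shown.

```python
from collections import Counter
from statistics import mean, variance


def clustered_proportion(cluster_of, observed):
    """Proportion of observed individuals whose cluster contains
    at least one other observed individual."""
    observed = list(observed)
    if not observed:
        raise ValueError("no observed individuals")
    # Cluster sizes counted among observed individuals only.
    sizes = Counter(cluster_of[i] for i in observed)
    linked = sum(1 for i in observed if sizes[cluster_of[i]] >= 2)
    return linked / len(observed)


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with the standard Rubin rules
    (assumed here; the abstract does not state the combining rule used)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical toy population: clusters A (3 members), B (2 members),
# and three unclustered individuals.
cluster_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "C", 6: "D", 7: "E"}

full = clustered_proportion(cluster_of, range(8))
partial = clustered_proportion(cluster_of, [0, 1, 3, 5, 6])

# Hypothetical clustering proportions and variances from three imputed data sets.
pooled, total_var = pool_rubin([0.5, 0.6, 0.7], [0.01, 0.01, 0.01])
```

In this toy example the individual in cluster B appears unclustered once its only cluster-mate is unobserved, so the clustered proportion computed from the observed subset is lower than the one computed from the full population.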
