Automated analysis of phylogenetic clusters

Background: As sequence data sets used for the investigation of pathogen transmission patterns increase in size, automated tools and standardized methods for cluster analysis have become necessary. We have developed an automated Cluster Picker which identifies monophyletic clades meeting user-input criteria for bootstrap support and maximum genetic distance within large phylogenetic trees. A second tool, the Cluster Matcher, automates the process of linking genetic data to epidemiological or clinical data, and matches clusters between runs of the Cluster Picker.

Results: We explore the effect of different bootstrap and genetic distance thresholds on clusters identified in a data set of publicly available HIV sequences, and compare these results to those of a previously published tool for cluster identification. To demonstrate their utility, we then use the Cluster Picker and Cluster Matcher together to investigate how clusters in the data set changed over time. We find that clusters containing sequences from more than one UK location at the first time point (multiple origin) were significantly more likely to grow than those representing only a single location.

Conclusions: The Cluster Picker and Cluster Matcher can rapidly process phylogenetic trees containing tens of thousands of sequences. Together these tools will facilitate comparisons of pathogen transmission dynamics between studies and countries.
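The cluster criterion described above (a monophyletic clade with sufficient bootstrap support whose sequences all lie within a maximum genetic distance of each other) can be illustrated with a minimal sketch. This is not the Cluster Picker's actual code: the `Node` class, the `pick_clusters` function, the toy tree, and the distance values are all hypothetical, and the sketch assumes support values are stored on internal nodes and that pairwise distances are supplied as a precomputed lookup.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a name, internal nodes a support value."""
    name: str = ""
    support: float = 0.0
    children: list = field(default_factory=list)

    def leaves(self):
        # Collect the names of all tips descending from this node.
        if not self.children:
            return [self.name]
        return [leaf for child in self.children for leaf in child.leaves()]


def pick_clusters(node, dist, min_support, max_distance):
    """Depth-first search for the largest clades meeting both thresholds.

    dist maps frozenset({tip_a, tip_b}) to a pairwise genetic distance.
    Once a clade qualifies, its subclades are not examined further.
    """
    if not node.children:
        return []
    tips = node.leaves()
    widest = max((dist[frozenset(p)] for p in combinations(tips, 2)), default=0.0)
    if len(tips) >= 2 and node.support >= min_support and widest <= max_distance:
        return [sorted(tips)]
    clusters = []
    for child in node.children:
        clusters.extend(pick_clusters(child, dist, min_support, max_distance))
    return clusters


# Toy example: two well-supported pairs joined by a poorly supported root.
tree = Node(support=0.5, children=[
    Node(support=0.95, children=[Node("A"), Node("B")]),
    Node(support=0.99, children=[Node("C"), Node("D")]),
])
dist = {frozenset(p): 0.1 for p in combinations("ABCD", 2)}
dist[frozenset("AB")] = 0.01
dist[frozenset("CD")] = 0.08

clusters = pick_clusters(tree, dist, min_support=0.9, max_distance=0.045)
print(clusters)
```

With these illustrative thresholds only the A/B pair qualifies: the C/D clade is well supported but exceeds the distance limit. Relaxing `max_distance` admits it, which mirrors the abstract's point that the clusters identified depend on the thresholds chosen.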
