Inferring Spatial Phylogenetic Variation Along Nucleotide Sequences: A Multiple Changepoint Model

We develop a Bayesian multiple changepoint model to infer spatial phylogenetic variation (SPV) along aligned molecular sequence data. SPV occurs in sequences from organisms that have undergone biological recombination or when evolutionary rates and selective pressures vary along the sequences. This Bayesian approach permits estimation of uncertainty regarding recombination, the crossing-over locations, and all other model parameters. The model assumes that the sites along the alignment separate into an unknown number of contiguous segments, each with possibly different evolutionary relationships between organisms, evolutionary rates, and transition:transversion ratios. We develop a transition kernel, use reversible-jump Markov chain Monte Carlo to fit our model, and draw inference from both simulated and real data. Through simulation, we examine the minimal length of recombinant segment that our model can detect for several levels of evolutionary divergence. We examine the entire genome of a reported human immunodeficiency virus type 1 (HIV-1) isolate, related to a purported recombinant virus thought to be the causative agent of an epidemic outbreak of HIV-1 infection among intravenous drug users in Russia. We find that regions of the genome differ in their evolutionary history and selective pressures. There is strong evidence for multiple crossovers along the genome and frequent shifts in selective pressure throughout the vif through env genes.
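The segment structure described above can be illustrated with a minimal Python sketch. It only shows how a set of changepoints partitions alignment sites into contiguous segments, each carrying its own parameters; it is not the authors' implementation and performs no inference. The names `SegmentParams`, `segments`, and `segment_of`, the half-open indexing convention, and the example values are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SegmentParams:
    """Hypothetical per-segment parameters (field names are illustrative)."""
    topology: str   # label for the tree relating the organisms
    rate: float     # evolutionary rate for the segment
    kappa: float    # transition:transversion ratio


def segments(n_sites, changepoints):
    """Split sites 0..n_sites-1 into contiguous half-open segments [start, end).

    Each changepoint is the first site of a new segment, so it must lie
    strictly inside the alignment.
    """
    cps = sorted(set(changepoints))
    if any(c <= 0 or c >= n_sites for c in cps):
        raise ValueError("changepoints must lie strictly inside the alignment")
    bounds = [0] + cps + [n_sites]
    return list(zip(bounds[:-1], bounds[1:]))


def segment_of(site, changepoints):
    """Return the index of the segment that contains the given site."""
    return bisect_right(sorted(set(changepoints)), site)


# Illustrative values only: two changepoints give three segments.
cps = [3, 7]
params = [
    SegmentParams("((A,B),C)", 1.0, 2.0),
    SegmentParams("((A,C),B)", 0.5, 4.0),
    SegmentParams("((A,B),C)", 1.5, 2.5),
]
for (start, end), p in zip(segments(10, cps), params):
    print(f"sites [{start}, {end}): topology={p.topology}, rate={p.rate}, kappa={p.kappa}")
```

In the model the number and locations of the changepoints are unknown, which is why a sampler that can move between parameter spaces of different dimension, such as reversible-jump MCMC, is needed; in this sketch they are simply fixed inputs.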
