A two-phase strategy for detecting recombination in nucleotide sequences

Genetic recombination can produce heterogeneous phylogenetic histories within a set of homologous genes. Delineating recombination events is important in the study of molecular evolution, as inference of such events provides a clearer picture of the phylogenetic relationships among different gene sequences or genomes. Nevertheless, detecting recombination events can be a daunting task, as the performance of different recombination-detecting approaches can vary, depending on evolutionary events that take place after recombination. We previously evaluated the effects of post-recombination events on the prediction accuracy of recombination-detecting approaches using simulated nucleotide sequence data. The main conclusion, supported by other studies, is that one should not depend on a single method when searching for recombination events. In this paper, we introduce a two-phase strategy, applying three statistical measures to detect the occurrence of recombination events, and a Bayesian phylogenetic approach to delineate breakpoints of such events in nucleotide sequences. We evaluate the performance of these approaches using simulated data, and demonstrate the applicability of this strategy to empirical data. The two-phase strategy proves to be time-efficient when applied to large datasets, and yields high-confidence results.
