Probabilistic Reconstruction of Ancestral Gene Orders with Insertions and Deletions

Changes in gene order have been used extensively as a signal for reconstructing phylogenies and ancestral genomes. Inferring the gene order of an extinct species has a wide range of applications, including the potential to reveal more detailed evolutionary histories, to determine gene content and ordering, and to understand the consequences of structural changes for organismal function and species divergence. In this study, we propose a new adjacency-based method, PMAG+, to infer ancestral genomes under a more general model of gene evolution that involves gene insertions and deletions (indels) in addition to gene rearrangements. PMAG+ improves on our previous method, PMAG, by developing a new approach to infer ancestral gene contents and by reducing the adjacency assembly problem to an instance of the traveling salesman problem (TSP). We designed a series of experiments to validate PMAG+ extensively and compared the results with those of the most recent comparable method, GapAdj. According to the results, the ancestral gene contents predicted by PMAG+ coincide closely with the actual contents, with error rates below 1 percent. Under various degrees of indels, PMAG+ consistently predicts ancestral gene orders more accurately while producing contigs very close to the actual chromosomes.
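To illustrate what reducing adjacency assembly to a TSP instance can look like, the following is a minimal sketch and not the PMAG+ implementation. It assumes that each gene contributes two cities (its tail `t` and head `h`), that predicted adjacencies carry probabilities, and that an edge costs the negative log of its probability. The constants `FORCED` and `MISSING`, the function names, and the brute-force solver are all hypothetical choices made for this toy example; a real instance would need a dedicated TSP solver.

```python
from itertools import permutations
import math

FORCED = -1000.0  # assumed reward that keeps the two ends of one gene consecutive
MISSING = 10.0    # assumed penalty for joining ends with no predicted adjacency

def build_cities(genes):
    # Each gene g becomes two cities: its tail (g, "t") and its head (g, "h").
    return [(g, end) for g in genes for end in ("t", "h")]

def edge_cost(u, v, adj_prob):
    if u[0] == v[0]:
        return FORCED  # the two extremities of the same gene
    p = adj_prob.get(frozenset((u, v)), 0.0)
    return -math.log(p) if p > 0 else MISSING

def solve_tsp(cities, adj_prob):
    # Exhaustive search over tours; only feasible for a handful of genes.
    first, rest = cities[0], cities[1:]
    best_tour, best_cost = None, math.inf
    for perm in permutations(rest):
        tour = (first,) + perm
        cost = sum(edge_cost(tour[i], tour[(i + 1) % len(tour)], adj_prob)
                   for i in range(len(tour)))
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour

def selected_adjacencies(tour, adj_prob):
    # Keep only the tour edges that are supported by a predicted adjacency;
    # unsupported edges mark the points where the tour is cut into contigs.
    chosen = set()
    for i in range(len(tour)):
        u, v = tour[i], tour[(i + 1) % len(tour)]
        if u[0] != v[0] and frozenset((u, v)) in adj_prob:
            chosen.add(frozenset((u, v)))
    return chosen

# Toy input: two strong adjacencies and one weaker, conflicting one.
adj_prob = {
    frozenset(((1, "h"), (2, "t"))): 0.9,
    frozenset(((2, "h"), (3, "t"))): 0.8,
    frozenset(((1, "h"), (3, "t"))): 0.3,  # conflicts with both adjacencies above
}
tour = solve_tsp(build_cities([1, 2, 3]), adj_prob)
chosen = selected_adjacencies(tour, adj_prob)
```

In this toy input the weaker adjacency competes with both stronger ones for the same extremities, so the cheapest tour keeps the two stronger adjacencies and leaves one unsupported edge, where the tour would be cut to yield a single contig with gene order 1, 2, 3.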
