Ancestral reconstruction under weighted maximum matching

Ancestral genome reconstruction has attracted increasing interest from both biologists and computer scientists, and it has been carried out under various evolutionary models ever since comparative genomics moved from sequence data to gene order data. We propose a Flexible Ancestral Reconstruction Model, FARM, which infers ancestral gene orders by combining maximum likelihood with weighted maximum matching. FARM accommodates a range of evolutionary scenarios: not only genomic rearrangements but also insertions and deletions (indels), segment duplications, and whole-genome duplications. We evaluate FARM in simulated evolution experiments and compare it with existing methods such as InferCarsPro, GASTS, and PMAG++. FARM shows significant improvements in running time and in the final assembly step, and can therefore be applied to ancestral inference on large-scale real biological data.
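
To illustrate how weighted maximum matching can turn scored adjacencies into an ancestral gene order, the following is a minimal sketch and not FARM's actual implementation. It assumes that each candidate adjacency between two gene extremities (tail `t` or head `h`) already carries a weight, for example a likelihood-derived probability; the function name, the extremity labels, and the weights are all hypothetical. The matching is solved exactly with a small bitmask dynamic program, which is only practical for a handful of extremities; a polynomial-time general-graph matching algorithm would be needed at genome scale.

```python
from functools import lru_cache


def max_weight_matching(nodes, weights):
    """Exact maximum weight matching via bitmask DP (small inputs only).

    nodes   -- list of gene extremities, e.g. "1h" for the head of gene 1
    weights -- dict mapping an unordered pair (u, v) to its adjacency weight
    Returns (total_weight, list_of_chosen_adjacencies).
    """
    index = {name: i for i, name in enumerate(nodes)}
    w = {}
    for (u, v), weight in weights.items():
        i, j = index[u], index[v]
        w[(min(i, j), max(i, j))] = weight
    n = len(nodes)

    @lru_cache(maxsize=None)
    def best(mask):
        # mask is the bitset of extremities that are still unmatched.
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unmatched extremity
        rest = mask & ~(1 << i)
        # Option 1: leave extremity i unmatched (it becomes a contig end).
        best_score, best_pairs = best(rest)
        # Option 2: pair i with any still-free neighbour j.
        for j in range(i + 1, n):
            if (rest >> j) & 1 and (i, j) in w:
                score, pairs = best(rest & ~(1 << j))
                score += w[(i, j)]
                if score > best_score:
                    best_score, best_pairs = score, pairs + ((i, j),)
        return best_score, best_pairs

    score, pairs = best((1 << n) - 1)
    return score, [(nodes[i], nodes[j]) for i, j in pairs]


# Hypothetical example: three genes, four conflicting candidate adjacencies.
extremities = ["1t", "1h", "2t", "2h", "3t", "3h"]
candidates = {
    ("1h", "2t"): 0.9,  # gene 1 followed by gene 2
    ("2h", "3t"): 0.8,  # gene 2 followed by gene 3
    ("1h", "3t"): 0.4,  # conflicts with both adjacencies above
    ("2h", "1t"): 0.3,  # conflicts with ("2h", "3t")
}
total, adjacencies = max_weight_matching(extremities, candidates)
```

Because a matching uses each extremity at most once, the selected adjacencies never conflict, so they can be chained into contiguous ancestral regions. Here the two highest-weight compatible adjacencies are kept, which corresponds to the gene order 1, 2, 3.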
