Scaffolding of Ancient Contigs and Ancestral Reconstruction in a Phylogenetic Framework

Ancestral genome reconstruction is an important task in the analysis of genome evolution. Recent progress in sequencing ancient DNA has led to the publication of so-called paleogenomes and allows this sequencing data to be integrated into genome evolution analyses. However, de novo assemblies of ancient genomes are usually fragmented, owing among other factors to DNA degradation over time. Integrated phylogenetic assembly addresses this fragmentation of the ancient DNA assembly while aiming to improve the reconstruction of all ancient genomes in the phylogeny simultaneously. The fragmented assembly of the ancient genome can be represented as an assembly graph, which may contain conflicting information about the order of contigs. In this setting, our approach is to compare the ancient data with finished extant genomes. We generalize a reconstruction approach that minimizes the Single-Cut-or-Join (SCJ) rearrangement distance to multifurcating trees and include edge lengths to improve the reconstruction in practice. The result is a polynomial-time algorithm that incorporates additional ancient DNA data at one node of the tree and yields consistent reconstructions of ancestral genomes.
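
As a rough illustration of the distance being minimized, the sketch below computes the Single-Cut-or-Join distance between two genomes, assuming each genome is a single linear chromosome given as a list of signed integers and both genomes contain the same genes. Under that model the distance is the number of adjacencies present in exactly one of the two genomes. The function names `adjacencies` and `scj_distance` and the tail/head labels are illustrative assumptions; this is not the reconstruction algorithm itself, only the pairwise distance it builds on.

```python
def adjacencies(chromosome):
    """Return the adjacency set of one linear chromosome.

    The chromosome is a list of signed integers (assumed encoding): a
    positive gene is read tail-to-head, a negative gene head-to-tail.
    Each adjacency is an unordered pair of gene extremities.
    """
    def extremities(gene):
        name = abs(gene)
        if gene > 0:
            return (name, "t"), (name, "h")  # left end, right end
        return (name, "h"), (name, "t")

    result = set()
    for left, right in zip(chromosome, chromosome[1:]):
        right_end_of_left = extremities(left)[1]
        left_end_of_right = extremities(right)[0]
        result.add(frozenset((right_end_of_left, left_end_of_right)))
    return result


def scj_distance(adj_a, adj_b):
    """SCJ distance: one cut or join per adjacency found in only one genome."""
    return len(adj_a ^ adj_b)


genome_a = adjacencies([1, 2, 3, 4])
genome_b = adjacencies([1, -2, 3, 4])  # gene 2 inverted
print(scj_distance(genome_a, genome_b))
```

In this example the inversion of gene 2 breaks two adjacencies and creates two new ones, so two cuts and two joins are needed. Because the distance decomposes over individual adjacencies, each adjacency can be treated as a binary character on the tree, which is what makes parsimony-style labeling applicable.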
