Pan-Genomes and de Bruijn Graphs Seminar Report

This report is based on the paper “Graphical pan-genome analysis with compressed suffix trees and the Burrows–Wheeler transform” (Baier et al. 2016). The pan-genome of a population is the collection of genomic sequences of the individuals in that population, as well as their genetic variations. Marcus et al. (2014) proposed the compressed de Bruijn graph as a suitable data structure for the pan-genome and introduced the splitMEM algorithm to construct it. Baier et al. (2016) improved on splitMEM and developed two algorithms that significantly outperform it. Minkin et al. (2016) devised a scalable, low-memory algorithm called TwoPaCo that is more efficient still.

Sebastian Gieße

[1] Paul Medvedev, et al. TwoPaCo: an efficient algorithm to build the compacted de Bruijn graph from many complete genomes, 2016, Bioinform.

[2] D. J. Wheeler, et al. A Block-sorting Lossless Data Compression Algorithm, 1994.

[3] Michael C. Schatz, et al. SplitMEM: a graphical algorithm for pan-genome analysis with suffix skips, 2014, Bioinform.

[4] Enno Ohlebusch, et al. The Enhanced Suffix Array and Its Applications to Genome Analysis, 2002, WABI.

[5] Chung Keung Poon, et al. Opportunistic data structures for range queries, 2005, COCOON.

[6] Dan Gusfield. Algorithms on Strings, Trees, and Sequences - Computer Science and Computational Biology, 1997.

[7] Giovanni Manzini, et al. Opportunistic data structures with applications, 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[8] Benjamin J. Raphael, et al. A novel method for multiple alignment of sequences with repeated and shuffled elements, 2004, Genome research.

[9] Enno Ohlebusch, et al. Graphical pan-genome analysis with compressed suffix trees and the Burrows-Wheeler transform, 2016, Bioinform.

[10] Nikolay Vyahhi, et al. Sibelia: A Scalable and Comprehensive Synteny Block Generation Tool for Closely Related Microbial Genomes, 2013, WABI.

[11] Alistair Moffat, et al. From Theory to Practice: Plug and Play with Succinct Data Structures, 2013, SEA.