Scaffolding and validation of bacterial genome assemblies using optical restriction maps

Motivation: New, high-throughput sequencing technologies have made it feasible to cheaply generate vast amounts of sequence information from a genome of interest. The computational reconstruction of the complete genome sequence is complicated by specific features of these technologies, such as the short length of the sequencing reads and the absence of mate-pair information. In this article we propose methods to overcome these limitations by incorporating information from optical restriction maps.

Results: We demonstrate the robustness of our methods to sequencing and assembly errors using extensive experiments on simulated datasets. We then present the results obtained by applying our algorithms to data generated from two bacterial genomes, Yersinia aldovae and Yersinia kristensenii. The resulting assemblies each contain a single scaffold covering a large fraction of the respective genome, suggesting that the careful use of optical maps can provide a cost-effective framework for the assembly of genomes.

Availability: The tools described here are available as an open-source package at ftp://ftp.cbcb.umd.edu/pub/software/soma

Contact: mpop@umiacs.umd.edu
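As a rough illustration of the kind of information an optical restriction map contributes, the sketch below computes the ordered fragment lengths that a restriction enzyme would produce from an assembled contig. This is the sequence-side quantity one would compare against the fragment sizes in an optical map. It is a simplified sketch and not the implementation in the package above: the function name `in_silico_digest`, the example contig, and the choice of recognition site are assumptions made here, the cut is placed at the start of each site, and the molecule is treated as linear.

```python
def in_silico_digest(sequence: str, site: str) -> list[int]:
    """Return the ordered fragment lengths obtained by cutting `sequence`
    at every occurrence of the recognition site `site`.

    Simplifying assumptions (not from the paper): the cut falls at the
    first base of the site, the molecule is linear, and only the given
    strand is searched (sufficient for palindromic sites).
    """
    seq = sequence.upper()
    site = site.upper()

    # Collect the start position of every (possibly overlapping) site.
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)

    # Fragment boundaries: sequence start, each cut, sequence end.
    bounds = [0] + cuts + [len(seq)]

    # Drop zero-length fragments (e.g. a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical contig with two copies of the site GAATTC.
contig = "AAAA" + "GAATTC" + "CCCCC" + "GAATTC" + "GG"
fragments = in_silico_digest(contig, "GAATTC")
```

The fragment lengths always sum to the contig length, which gives a quick sanity check before any comparison against measured optical-map fragment sizes, which in practice carry sizing error and missed or spurious cut sites.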
