CONTIGuator: a bacterial genomes finishing tool for structural insights on draft genomes

Recent developments in sequencing technologies have made it possible to sequence many bacterial genomes at limited cost and labor compared to previous techniques. However, a limiting step of genome sequencing is the finishing process, which is needed to infer the relative position of each contig and to close sequencing gaps. Bacterial species harboring more than one replicon add a further degree of complexity, and they are not handled by currently available programs. The availability of a large number of bacterial genomes allows geneticists to use complete genomes (possibly from the same species) as templates for contig mapping.

Here we present CONTIGuator, a software tool for mapping contigs onto a reference genome. It produces a visual contig map that highlights the loss and/or gain of genetic elements, and it supports finishing multipartite genomes. CONTIGuator was tested on four genomes, where it performed better than currently available programs.

Our approach appears efficient and provides a clear visualization, allowing the user to perform comparative structural genomics analysis on draft genomes. CONTIGuator is a Python script for Linux environments that runs on ordinary desktop machines; it can be downloaded from http://contiguator.sourceforge.net.
