Comparative genomics-based investigation of resequencing targets in Vibrio fischeri: Focus on point miscalls and artefactual expansions

Background: Sequence closure often represents the end-point of a genome project, without a system in place for subsequent improvement and refinement. Building on the genome project of Vibrio fischeri ES114, we used a comparative approach to identify and investigate genes that had a high likelihood of sequence error.

Results: Comparison of the V. fischeri ES114 genome with that of the conspecific strain MJ11 identified 82 target loci in ES114 as containing likely errors, and thus as high priority for resequencing. Analysis of the targets identified 75 loci in which an error had occurred, resulting in the correction of 10,457 base pairs to generate the new ES114 genomic sequence. A majority of the inaccurate loci involved frameshift errors, correction of which fused adjacent ORFs. Although insertions/deletions are thought to be rare in microbial genome assemblies, fourteen of the loci contained extraneous sequence of over 300 bp, likely due to imperfect contig ends that were misassembled in tandem rather than as overlapping segments. Additionally, we updated the entire genome annotation with 113 new features, including previously uncalled protein-coding genes, regulatory RNA genes and operon leader peptides, and we analyzed the transcriptional apparatus encoded by ES114.

Conclusion: We demonstrate that errors in microbial genome sequences, thought to be largely confined to point mutations, may also include prevalent large-scale rearrangements such as insertions. Ongoing genome quality control and annotation programs are necessary to accompany technological advancements in data generation. These updates further advance V. fischeri as an important model for understanding intercellular communication and colonization of animal tissue.
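
The frameshift pattern noted in the Results (two adjacent ORFs that fuse into one once the error is corrected) suggests one simple way such targets could be flagged by comparison with a second strain. The sketch below is illustrative only and is not the authors' pipeline: it assumes best protein hits against the reference strain have already been computed by some alignment tool, and the `Hit` record, the `candidate_split_orfs` function, the `max_overlap` threshold and all identifiers are hypothetical. It reports pairs of neighbouring ORFs on the same strand whose best hits cover different parts of the same reference protein.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """Best hit of one ORF against the reference strain (hypothetical record)."""
    query: str    # ORF identifier in the genome being checked
    order: int    # rank of the ORF along its replicon (0, 1, 2, ...)
    strand: str   # "+" or "-"
    subject: str  # best-matching protein in the reference strain
    s_start: int  # first aligned residue on the subject (1-based)
    s_end: int    # last aligned residue on the subject (inclusive)


def candidate_split_orfs(hits, max_overlap=10):
    """Return (orf_a, orf_b, subject) for neighbouring ORFs that look like
    two halves of a single reference protein.

    max_overlap is an assumed tolerance (in residues) for how much the two
    aligned regions on the subject may overlap.
    """
    by_subject = defaultdict(list)
    for h in hits:
        by_subject[h.subject].append(h)

    candidates = []
    for subject, group in by_subject.items():
        group.sort(key=lambda h: h.order)
        for a, b in zip(group, group[1:]):
            adjacent = (b.order - a.order == 1) and (a.strand == b.strand)
            # Length of the shared region on the subject; <= 0 means disjoint.
            overlap = min(a.s_end, b.s_end) - max(a.s_start, b.s_start) + 1
            if adjacent and overlap <= max_overlap:
                candidates.append((a.query, b.query, subject))
    return candidates


if __name__ == "__main__":
    example = [
        Hit("orfA", 0, "+", "ref1", 1, 120),    # N-terminal part of ref1
        Hit("orfB", 1, "+", "ref1", 118, 300),  # C-terminal part of ref1
        Hit("orfC", 2, "+", "ref2", 1, 200),    # intact gene, single hit
    ]
    for pair in candidate_split_orfs(example):
        print(pair)
```

A flagged pair is only a resequencing candidate: a genuine pseudogene or a true strain difference would produce the same signal, so the call still has to be confirmed against the sequence itself.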
