Bacillus anthracis genome organization in light of whole transcriptome sequencing

Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What rules connect gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are measures of translation efficiency, e.g. the codon adaptation index and the ribosome binding site (RBS) score, related to gene expression? We used whole transcriptome shotgun sequencing of the bacterial pathogen Bacillus anthracis to assess the correlation of gene expression level with promoter, terminator and RBS scores, the codon adaptation index, and a new measure of gene translational efficiency, the average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. In assessing the accuracy of the current annotation of protein-coding genes in the B. anthracis genome, we found that the transcriptome data indicate the existence of more than a hundred genes that are missing from the annotation although they are predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes retain not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.
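Of the measures named above, the codon adaptation index has a standard closed-form definition (Sharp and Li, 1987): the geometric mean, over the codons of a gene, of each codon's relative adaptiveness, i.e. its count in a reference set of highly expressed genes divided by the count of the most frequent synonymous codon. The following is a minimal sketch of that definition only, not the pipeline used in this work. The function names, the toy reference set and the pseudocount of 0.5 for codons unseen in the reference set are assumptions made for illustration.

```python
import itertools
import math
from collections import Counter

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}


def codons(seq):
    """Split a coding sequence into complete codons (trailing bases dropped)."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes, pseudocount=0.5):
    """Weight w = count(codon) / count(most used synonymous codon).

    Stop codons and single-codon amino acids (Met, Trp) are excluded.
    Codons absent from the reference set get `pseudocount` (an assumption)
    so that the logarithm in cai() stays defined.
    """
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    weights = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue
        adjusted = {c: counts[c] or pseudocount for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the gene's informative codons."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    if not logs:
        raise ValueError("gene contains no codons with a defined weight")
    return math.exp(sum(logs) / len(logs))


# Toy reference set (hypothetical): alanine is encoded 3x by GCT, 1x by GCC.
weights = relative_adaptiveness(["GCTGCTGCTGCC"])
print(cai("GCTGCT", weights))  # only the preferred codon is used
print(cai("GCTGCC", weights))  # mixes preferred and less preferred codons
```

Per-gene values computed this way could then be compared with RNA-Seq expression estimates, for example by a rank correlation. The choice of reference set strongly affects the result and is left open here.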
