Transcriptome dynamics-based operon prediction in prokaryotes

Background: Inferring operon maps is crucial to understanding the regulatory networks of prokaryotic genomes. Recent RNA-seq based transcriptome studies have revealed that in many bacterial species the operon structure varies as environmental conditions change. New computational solutions that use both static and dynamic data are therefore needed to create condition-specific operon predictions.

Results: In this work, we propose a novel classification method that integrates RNA-seq based transcriptome profiles with genomic sequence features to accurately identify the operons that are expressed under a measured condition. The classifiers are trained on a small set of confirmed operons and then used to classify the remaining gene pairs of the organism studied. Finally, by linking consecutive gene pairs classified as operon pairs, our computational approach produces condition-dependent operon maps. We evaluated our approach on various RNA-seq expression profiles of the bacteria Haemophilus somni, Porphyromonas gingivalis, Escherichia coli and Salmonella enterica. Our results demonstrate that, using features that depend on both transcriptome dynamics and genome sequence characteristics, we can identify operon pairs with high accuracy. Moreover, combining DNA sequence and expression data yields more accurate predictions than either data type alone.

Conclusion: We present a computational strategy for the comprehensive analysis of condition-dependent operon maps in prokaryotes. Our method can be used to generate condition-specific operon maps for many bacterial organisms for which high-resolution transcriptome data are available.
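The final step described in the abstract, linking consecutive gene pairs classified as operon pairs into an operon map, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `link_operon_pairs`, the gene identifiers, and the input format (genes listed in genomic order on a single strand, with one boolean classifier call per adjacent pair) are all assumptions made for illustration.

```python
from typing import List, Sequence


def link_operon_pairs(
    genes: Sequence[str], pair_is_operon: Sequence[bool]
) -> List[List[str]]:
    """Chain adjacent gene pairs predicted as operon pairs into operons.

    genes: gene identifiers in genomic order on one strand (assumed format).
    pair_is_operon[i]: classifier call for the pair (genes[i], genes[i + 1]).
    Returns a list of gene groups; single-gene groups are genes that were
    not linked to either neighbour.
    """
    if len(pair_is_operon) != max(len(genes) - 1, 0):
        raise ValueError("expected one prediction per adjacent gene pair")

    operons: List[List[str]] = []
    current: List[str] = []
    for i, gene in enumerate(genes):
        current.append(gene)
        is_last = i == len(genes) - 1
        # Close the current group at the end of the list or when the
        # pair (gene, next gene) was not classified as an operon pair.
        if is_last or not pair_is_operon[i]:
            operons.append(current)
            current = []
    return operons


# Hypothetical gene names and classifier calls, for illustration only.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
calls = [True, True, False, True]
operon_map = link_operon_pairs(genes, calls)
multi_gene_operons = [op for op in operon_map if len(op) > 1]
```

Because the classifier calls depend on the expression profile, running the same linking step on predictions from a different condition would give a different map, which is what makes the resulting operon maps condition-dependent.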
