Detecting sequence dependent transcriptional pauses from RNA and protein number time series

Background

Evidence suggests that in prokaryotes sequence-dependent transcriptional pauses affect the dynamics of transcription and translation, as well as of small genetic circuits. So far, a few pause-prone sequences have been identified from in vitro measurements of transcription elongation kinetics.

Results

Using a stochastic model of gene expression at the nucleotide and codon levels with realistic parameter values, we investigate three different but related questions and present statistical methods for their analysis. First, we show that information from in vivo time series of RNA and protein numbers is sufficient to discriminate between models with and without a pause site in their coding sequence. Second, we demonstrate that a large variety of models, with pauses of various durations and at various locations in the template, can be separated from each other by means of hierarchical clustering and a random forest classifier. Third, we introduce an approximate likelihood function that allows the location of a pause site to be estimated.

Conclusions

This method can aid in detecting unknown pause-prone sequences from temporal measurements of RNA and protein numbers at a genome-wide scale, and thus help elucidate the possible roles that these sequences play in the dynamics of genetic networks and phenotype.
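To make the idea of a pause site concrete, the following is a minimal sketch and not the model used in the paper. It assumes a single polymerase that steps along the template with exponentially distributed waiting times and that, at one hypothetical pause site, enters a pause with a fixed probability. The function names and all parameter values (template length, stepping rate, pause probability, pause escape rate) are illustrative assumptions.

```python
import random
import statistics


def elongation_time(length, step_rate, pause_site=None,
                    pause_prob=0.0, pause_rate=1.0, rng=random):
    """Time for one polymerase to traverse `length` nucleotides.

    Each nucleotide step takes an exponential time with rate `step_rate`.
    At `pause_site` (if given) the polymerase pauses with probability
    `pause_prob` for an exponential time with rate `pause_rate`.
    """
    t = 0.0
    for pos in range(length):
        t += rng.expovariate(step_rate)
        if pos == pause_site and rng.random() < pause_prob:
            t += rng.expovariate(pause_rate)
    return t


def mean_elongation_time(n_runs, seed, **kwargs):
    """Average elongation time over `n_runs` independent simulated runs."""
    rng = random.Random(seed)
    return statistics.fmean(
        elongation_time(rng=rng, **kwargs) for _ in range(n_runs)
    )


# Hypothetical parameters: 300-nt template, 40 nt/s stepping rate,
# pause at position 150 entered with probability 0.8, mean pause 5 s.
no_pause = mean_elongation_time(2000, seed=1, length=300, step_rate=40.0)
with_pause = mean_elongation_time(
    2000, seed=1, length=300, step_rate=40.0,
    pause_site=150, pause_prob=0.8, pause_rate=0.2,
)

print(f"mean elongation time without pause: {no_pause:.2f} s")
print(f"mean elongation time with pause:    {with_pause:.2f} s")
```

Under these assumptions the expected elongation time is 300/40 = 7.5 s without the pause and 7.5 + 0.8 × 5 = 11.5 s with it. A shift of this kind in the timing of RNA production is the sort of signal that the statistical methods summarized above aim to detect from RNA and protein number time series.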
