SPARCS: a web server to analyze (un)structured regions in coding RNA sequences

More than a simple carrier of genetic information, the coding regions of messenger RNA (mRNA) can also harbor functional elements that evolved to control post-transcriptional processes such as mRNA splicing, localization and translation. Functional elements in RNA molecules are often encoded by secondary structure elements. In this article, we introduce Structural Profile Assignment of RNA Coding Sequences (SPARCS), an efficient method to analyze the (secondary) structure profile of protein-coding regions in mRNAs. First, we develop a novel algorithm that uniformly samples the sequence landscape while preserving the dinucleotide frequency and the encoded amino acid sequence of the input mRNA. Then, we use this algorithm to generate a set of artificial sequences from which we estimate the Z-scores of classical structural metrics, such as the sum of base pairing probabilities and the base pairing entropy. Finally, we use these metrics to predict structured and unstructured regions in the input mRNA sequence. We applied our methods to study the structural profile of the ASH1 genes and recovered key structural elements. A web server implementing this discovery pipeline is available at http://csb.cs.mcgill.ca/sparcs together with the source code of the sampling algorithm.
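
The Z-score step described above can be illustrated with a minimal sketch. It assumes that a structural metric (for example, the sum of base pairing probabilities) has already been computed for the input sequence and for each sampled sequence; the function name `z_score` and the use of the sample standard deviation are assumptions made for illustration, not details taken from SPARCS.

```python
from statistics import mean, stdev


def z_score(observed, background):
    """Z-score of an observed metric relative to background values.

    `observed` is the metric for the input mRNA; `background` holds the
    same metric computed on the sampled artificial sequences.
    (Hypothetical helper; not the SPARCS implementation.)
    """
    if len(background) < 2:
        raise ValueError("need at least two background values")
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation (n - 1); an assumption
    if sigma == 0:
        raise ValueError("background values have zero variance")
    return (observed - mu) / sigma


# Toy numbers, not real data: background mean is 12.0, sample std is 2.0.
print(z_score(8.0, [10.0, 12.0, 14.0]))
```

Under this convention, a strongly negative or positive Z-score indicates that the input sequence differs from what the sampled background would suggest; how the sign maps to "structured" versus "unstructured" depends on the metric chosen.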
