Discovering well-ordered folding patterns in nucleotide sequences

MOTIVATION Growing evidence indicates that local well-ordered structures are closely correlated with cis-acting elements in the post-transcriptional regulation of gene expression. Predicting well-ordered folding sequences (WFSs) in genomic sequences helps to identify local RNA elements with structure-dependent functions in mRNAs.

RESULTS In this study, the quality of a local WFS is assessed by the energy difference, E(diff), between the free energy of the global minimal structure folded in the segment and that of its corresponding optimal restrained structure (ORS). The ORS is the optimal structure obtained when none of the base-pairs in the global minimal structure is allowed to form. WFSs in HIV-1 mRNA, various ferritin mRNAs and genomic sequences containing the let-7 RNA gene were searched for with a novel method, ed_scan. Our results indicate that the detected WFSs coincide with the known Rev response element in HIV-1 mRNA, iron-responsive elements in ferritin mRNAs and small let-7 RNAs in Caenorhabditis elegans, Caenorhabditis briggsae and Drosophila melanogaster genomic sequences. The statistical significance of a WFS is assessed by a quantitative measure, Zscr(e), which is a z-score of E(diff), and by extensive random simulations. We suggest that WFSs with high statistical significance have structural roles that involve their sequence information.

AVAILABILITY The source code of ed_scan is available via anonymous ftp at ftp://ftp.ncifcrf.gov/pub/users/shuyun/scan/ed_scan.tar.
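
As a rough illustration of the scoring idea described under RESULTS, the Python sketch below computes E(diff) for a few windows and converts the values to z-scores. It is not the ed_scan implementation. Several points are assumptions: the sign convention (E(diff) taken as the ORS energy minus the global minimum energy, so that larger values mean a better-ordered segment), the use of the population standard deviation, the choice of the scanned windows themselves as the reference distribution, and the energy values, which are invented for the example. The function names e_diff and z_scores are hypothetical, and the folding step that would produce the energies is not shown.

```python
from statistics import mean, pstdev


def e_diff(e_min, e_ors):
    """Energy difference for one window (assumed sign: E(ORS) - E(min)).

    e_min: free energy of the global minimal structure (kcal/mol).
    e_ors: free energy of the optimal restrained structure, in which
           none of the base-pairs of the global minimal structure may form.
    """
    return e_ors - e_min


def z_scores(values):
    """Standardize values against their own mean and population std. dev."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; z-score is undefined")
    return [(v - mu) / sigma for v in values]


# Invented (e_min, e_ors) pairs for four sliding windows; illustrative only.
windows = [(-20.0, -18.0), (-25.0, -15.0), (-22.0, -20.0), (-21.0, -19.0)]

diffs = [e_diff(e_min, e_ors) for e_min, e_ors in windows]
scores = z_scores(diffs)

# Index of the window whose E(diff) stands out most from the others.
best = max(range(len(scores)), key=scores.__getitem__)
```

Here the second window has by far the largest gap between its global minimal structure and its restrained alternative, so it receives the highest score. In practice the reference distribution could instead come from randomized sequences, which is one reading of the "extensive random simulations" mentioned above.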