Accelerated off-target search algorithm for siRNA

MOTIVATION Designing highly effective short interfering RNA (siRNA) sequences with maximum target-specificity for mammalian RNA interference (RNAi) is one of the most active topics in molecular biology. The relationship between siRNA sequences and RNAi activity has been studied extensively to establish rules for selecting highly effective sequences. However, there is a pressing need to efficiently compute siRNA sequences that minimize off-target silencing effects, and to match them against any non-targeted sequences while allowing mismatches.

RESULTS Enumerating potential cross-hybridization candidates is non-trivial, because siRNA sequences are short (ca. 19 nt in length) and at least three mismatches with non-targets are required. With at least three mismatches, there are typically only four or five contiguous matches, so a BLAST search frequently overlooks off-target candidates. By contrast, existing accurate approaches are expensive to execute. We therefore developed an accurate, efficient algorithm that uses seed hashing, the pigeonhole principle, and combinatorics to identify mismatch patterns. Tests show that our method can rapidly list potential cross-hybridization candidates for any siRNA sequence of a selected human gene, outperforming traditional methods by orders of magnitude in computational performance.

AVAILABILITY http://design.RNAi.jp

CONTACT yamada@cb.k.u-tokyo.ac.jp
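The pigeonhole argument named in RESULTS can be illustrated with a minimal sketch. It is not the authors' implementation, only a generic seed-and-verify search under stated assumptions: if a 19-nt query may differ from a window in at most 2 positions (that is, fewer than the three mismatches required for specificity), then splitting the query into 3 blocks guarantees that at least one block matches exactly. The function names (`block_bounds`, `build_index`, `find_near_matches`), the parameter `max_mismatches`, and the dictionary-of-strings input format are all hypothetical choices made for this illustration.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def block_bounds(length, parts):
    """Split range(length) into `parts` contiguous, nearly equal blocks."""
    base, extra = divmod(length, parts)
    bounds, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def build_index(text, k):
    """Hash every k-mer of `text` to the list of positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(text) - k + 1):
        index[text[i:i + k]].append(i)
    return index


def find_near_matches(query, sequences, max_mismatches=2):
    """Return sorted (name, position, mismatches) for every window within
    `max_mismatches` of `query`.

    Assumption for this sketch: `sequences` maps a name to a plain string
    over the same alphabet as `query`.
    """
    n = len(query)
    hits = set()
    for name, seq in sequences.items():
        indexes = {}  # one k-mer index per distinct seed length
        for start, end in block_bounds(n, max_mismatches + 1):
            k = end - start
            if k not in indexes:
                indexes[k] = build_index(seq, k)
            # Pigeonhole: some block must occur exactly in any true hit.
            for pos in indexes[k].get(query[start:end], ()):
                window_start = pos - start
                if window_start < 0 or window_start + n > len(seq):
                    continue
                d = hamming(query, seq[window_start:window_start + n])
                if d <= max_mismatches:
                    hits.add((name, window_start, d))
    return sorted(hits)


if __name__ == "__main__":
    query = "ACGTTGCAAGCTTAGGCTA"  # hypothetical 19-nt query
    database = {"t1": "CCCC" + "ACGTAGCAAGCTTAGGCTT" + "GGGG"}
    for hit in find_near_matches(query, database):
        print(hit)
```

Only windows that share an exact seed with the query are verified, so the expensive mismatch count runs on a small candidate set instead of on every window. A production tool would index the whole transcript database once rather than per query, and would also need to handle the reverse complement; both are omitted here for brevity.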
