New, Improved, and Practical k-Stem Sequence Similarity Measures for Probe Design

We define new measures of sequence similarity for oligonucleotide probe design. These measures incorporate nearest-neighbor k-stem motifs in their definition, yet can be computed efficiently with a bit-vector method, at lower computational cost than algorithms that predict nearest-neighbor hybridization potential. They correlate significantly better with nearest-neighbor thermodynamic predictions than BLAST or the standard edit and insertion-deletion similarities already used in many probe design applications.
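To illustrate what a bit-vector computation of a sequence similarity can look like, the sketch below computes the length of the longest common subsequence (LCS) of two sequences, which is the standard insertion-deletion similarity mentioned above. It is not the k-stem measure defined in this paper; it uses the well-known bit-parallel LCS recurrence, and the function name `lcs_length_bitvector` is an assumption chosen for this example.

```python
def lcs_length_bitvector(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Illustrative sketch of the classic bit-parallel LCS recurrence;
    this is NOT the k-stem similarity measure from the paper.
    """
    m = len(a)
    mask = (1 << m) - 1  # keeps the state vector to m bits

    # match[c] has bit i set exactly when a[i] == c.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    # State vector: all ones initially; each zero bit that appears
    # corresponds to one unit of LCS length.
    v = mask
    for ch in b:
        u = v & match.get(ch, 0)
        # One dynamic-programming column is updated with a constant
        # number of word operations (Python ints act as long words).
        v = ((v + u) | (v - u)) & mask

    # LCS length = number of zero bits among the low m bits of v.
    return m - bin(v).count("1")


if __name__ == "__main__":
    # Hypothetical probe/target pair, for illustration only.
    print(lcs_length_bitvector("AGCAT", "GAC"))
```

Each character of `b` costs a fixed number of additions and bitwise operations on an m-bit vector, which is why measures of this kind can be evaluated much faster than a cell-by-cell dynamic program.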
