In-silico Design of DNA Oligonucleotides: Challenges and Approaches

DNA oligonucleotides are essential components of many technologies in molecular biology. The key event in every oligonucleotide-based assay is the specific binding of an oligonucleotide to its target DNA. However, single-stranded DNA molecules also tend to bind to unintended targets or to themselves, and the probability of such nonspecific binding increases with the complexity of the assay. Accurate data management and well-defined design workflows are therefore necessary to optimize the in-silico design of primers and probes. Both data management and the design process require careful consideration of computational infrastructure and run time. A sequence data management system consists mainly of data retrieval, data updates, storage, filtering and analysis. Each of these parts needs to be well implemented, because the resulting sequences form the basis for the oligonucleotide design. For the in-silico selection of oligonucleotides, key features such as oligonucleotide length, melting temperature, secondary structures, primer dimer formation and specificity should be considered. Developing an efficient oligonucleotide design workflow demands the right balance between the precision of the applied computational models, the overall time required and the computational workload. This paper gives an overview of the important parameters of the design process, from data retrieval to the design parameters for optimized oligonucleotide design.
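As an illustration of the key features named above, the following minimal Python sketch screens a candidate oligonucleotide by length, GC content, a rough melting temperature and a crude 3'-end self-dimer check. It is not the workflow described in this paper. The function names, the threshold values and the 4-base window are assumptions chosen for the example, and the melting temperature uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is far less precise than the nearest-neighbor models a production workflow would normally apply.

```python
# Illustrative sketch only: names and thresholds are assumptions, not values from the paper.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> float:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_3prime_self_dimer(seq: str, window: int = 4) -> bool:
    """Crude heuristic: can the last `window` bases pair perfectly with
    some stretch of a second copy of the same oligonucleotide?"""
    seq = seq.upper()
    tail = seq[-window:]
    return reverse_complement(tail) in seq


def passes_basic_filters(
    seq: str,
    min_len: int = 18,      # assumed illustrative bounds
    max_len: int = 25,
    min_gc: float = 0.40,
    max_gc: float = 0.60,
    min_tm: float = 50.0,
    max_tm: float = 65.0,
) -> bool:
    """Apply length, GC, Wallace-rule Tm and 3' self-dimer filters in turn."""
    if not (min_len <= len(seq) <= max_len):
        return False
    if not (min_gc <= gc_fraction(seq) <= max_gc):
        return False
    if not (min_tm <= wallace_tm(seq) <= max_tm):
        return False
    return not has_3prime_self_dimer(seq)


if __name__ == "__main__":
    candidate = "AGCAGTCCATCAGGTACAGA"
    print(len(candidate), gc_fraction(candidate), wallace_tm(candidate))
    print(passes_basic_filters(candidate))
```

Cheap filters of this kind are typically run first so that the more expensive steps, such as thermodynamic secondary-structure prediction and specificity searches against a sequence database, are applied only to the candidates that survive.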
