Classification method for microarray probe selection using sequence, thermodynamics and secondary structure parameters

Probe design is the most important step in any microarray-based assay. Accurate and efficient probe design and selection for the target sequence are critical for generating reliable and useful results. Several approaches to probe design have been reported in the literature, and a growing number of bioinformatics tools are available for the task. However, reported prediction accuracies remain low, and determining the hybridization efficiency of probes is still a major computational challenge. The present study extracts various novel features related to sequence composition, thermodynamics and secondary structure that may be essential for designing good probes. A feature selection method is used to assess the relative importance of these features. We also validate the importance of various features currently used for designing oligonucleotide probes. Finally, we present a classification methodology that can be used to predict the hybridization quality of a probe.
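As a rough illustration of what sequence-derived probe features can look like, the sketch below computes a few simple descriptors for a single oligonucleotide. The function name `probe_features` and the particular features chosen (GC fraction, a Wallace-rule melting temperature estimate, longest homopolymer run) are assumptions made for this example; they are not the feature set used in the study, and the Wallace rule is only a crude approximation intended for short oligonucleotides.

```python
from itertools import groupby


def probe_features(seq: str) -> dict:
    """Compute a few illustrative sequence features for a DNA probe.

    Hypothetical helper: the feature names and choices are assumptions,
    not the features described in the paper.
    """
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty DNA sequence over A, C, G, T")

    gc = seq.count("G") + seq.count("C")
    at = len(seq) - gc

    # Length of the longest run of a single repeated base, e.g. "TTTT" -> 4.
    longest_run = max(len(list(group)) for _, group in groupby(seq))

    return {
        "length": len(seq),
        "gc_fraction": gc / len(seq),
        # Wallace rule: Tm ~ 2*(A+T) + 4*(G+C) in degrees Celsius (rough estimate).
        "wallace_tm_c": 2 * at + 4 * gc,
        "longest_homopolymer": longest_run,
    }
```

Feature dictionaries of this kind could then be stacked into a matrix and passed to a feature selection step and a classifier; which selector and classifier to use is left open here.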
